2.4. BAHD-type malonyltransferase genes from Digitalis lanata

Focusing once again on the role of 21-O-malonylation in cardenolide formation in Digitalis, we performed a function-based search for BAHD-type malonyltransferases in the transcriptome database of D. purpurea (https://medicinalplantgenomics.msu.edu). This search returned nine hits. The respective sequences were annotated as either putative malonyl-CoA:anthocyanin 5-O-glucoside-6′′-O-malonyltransferases or putative quercetin 3-O-glucoside-6′′-O-malonyltransferases. The sequences were checked for completeness by protein BLAST searches against the NCBI databases (Altschul et al., 1990). Finally, two promising candidates were identified, both annotated as quercetin 3-O-glucoside-6′′-O-malonyltransferases and encoding proteins with molecular masses of 47 kDa and 50 kDa, respectively.

The first sequence isolated from D. lanata, termed Dlmat1 (GenBank MT992078), comprised 1374 nucleotides encoding a protein of 457 amino acids with a molecular mass of 51.0 kDa. Dlmat1 was 94% identical to a sequence in the D. purpurea genome annotated as quercetin 3-O-glucoside-6′′-O-malonyltransferase. BLAST searches also revealed high sequence identities with BAHD-type malonyltransferases from Dorcoceras hygrometricum Bunge and Sesamum indicum L. The four motifs conserved among all members of the BAHD enzyme superfamily (D’Auria, 2006; Tuominen et al., 2011) were confirmed to be present in DlMaT1. The clade-Ia-specific motif LTFFD (Tuominen et al., 2011) and the anthocyanin-metabolism-related motif NYFGNC (Yu et al., 2009) were also identified. Using this sequence, we isolated three further genes coding for putative BAHD-type malonyltransferases (Dlmat2, GenBank MW013542; Dlmat3, GenBank MW013543; Dlmat4, GenBank MW013544). These genes shared nucleotide sequence identities of 49.6% to 92.1% and amino acid sequence identities of 32.1% to 93.4% with each other and with Atpmat1. The protein sequences were further analyzed using SignalP (Nielsen, 2017) and DeepLoc-1.0 (Almagro Armenteros et al., 2017). All of them were predicted to be ‘soluble’ proteins, as already described for several BAHD-type enzymes (D’Auria, 2006; Bontpart et al., 2015).
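As an illustration only, the motif check and the sequence-length arithmetic described above can be sketched in a few lines of Python. This is not the procedure used in the study. The function names and the toy peptide are hypothetical, and the length check assumes that the 1374 nucleotides include a single stop codon. Only the two motifs named in the text (LTFFD and NYFGNC) are searched.

```python
import re

# Motifs named in the text. The labels are shorthand chosen for this sketch.
MOTIFS = {"clade Ia": "LTFFD", "anthocyanin-related": "NYFGNC"}


def find_motifs(protein, motifs=MOTIFS):
    """Return {label: [0-based start positions]} for each motif in a protein sequence."""
    seq = protein.upper()
    return {
        # The lookahead lets overlapping occurrences be reported as well.
        label: [m.start() for m in re.finditer(f"(?={re.escape(pattern)})", seq)]
        for label, pattern in motifs.items()
    }


def orf_length_to_residues(n_nucleotides):
    """Amino acids encoded by an ORF, assuming the length includes one stop codon."""
    if n_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return n_nucleotides // 3 - 1


# Hypothetical toy peptide, NOT the DlMaT1 sequence.
toy = "MASLTFFDKKAANYFGNCVV"
hits = find_motifs(toy)
print(hits)
print(orf_length_to_residues(1374))
```

Under the stop-codon assumption, 1374 nucleotides correspond to 458 codons, that is, 457 amino acids, which matches the length reported for DlMaT1. With a real sequence, an empty position list for a motif would indicate that the motif is absent.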