Introduction
DNA methylation is the most studied epigenetic modification. Its involvement in gene expression regulation is of great interest, both for the study of physiological mechanisms associated with cell differentiation and function, and for the increasingly evident contribution to disease onset and progression1.
Methylation at cytosines within the CpG dinucleotide has been considered the main site of DNA methylation in mammals and virtually the only one with functional significance2. DNA methylation at other cytosine moieties, usually indicated as “non-CpG” or “CpH” (i.e. methylation at CpA, CpC or CpT cytosines), was historically thought to be restricted to embryonic tissues and stem cells3. More recently, an increasing number of reports have shown that cytosines at non-CpG sites can also present high methylation levels and that this modification retains a functional role, being associated with gene expression regulation4. Non-CpG DNA methylation seems particularly relevant in brain cells5,6. Our group has contributed to this field, suggesting that a technical bias that arose over time in the bisulfite assay on target sequences causes the underestimation of non-CpG methylation7. This finding was supported by the relevant methylation levels observed at non-CpG sites when genomic techniques, particularly those not dependent on the bisulfite-PCR assay, are used8,9.
Next-generation sequencing (NGS) techniques are increasingly used in methylation studies due to their ability to provide a profile of many differentially methylated sites, possibly associated with specific DNA methylation patterns across the genome10,11. Despite allowing the assessment of complex methylation profiles across the genome and coordinated changes in associated genes, the “omic” approach may be less informative at the level of individual genes, since some of these approaches only analyse a limited number of CpGs (i.e. RRBS, EPIC or other DNA methylation arrays), not necessarily representative of the whole promoter methylation. Therefore, methylation analysis through bisulfite assay followed by Sanger sequencing of specific target gene regions remains the most accurate approach for profiling DNA methylation patterns with single-cytosine resolution in functional epigenetic studies7,12,13. In this workflow, methylation status is inferred by comparing bisulfite-treated sequences to their unmodified counterparts: methylated cytosines remain as C, whereas unmethylated cytosines are converted to T12. This step requires comparing the sequencing output with the original genomic sequence, typically retrieved from GenBank. Originally performed manually, this Sanger-based bisulfite-PCR analysis process was later facilitated by some bioinformatics tools that automate this comparison, thereby speeding up the process and reducing human error14,15,16. However, all these tools were specifically designed to identify CpG methylation only. None of the released tools is capable of aligning sequences and reporting methylation at non-CpG residues, probably due to the historical bias that considered CpG methylation as the only functionally relevant type7.
The lack of tools capable of detecting and quantifying non-CpG methylation at single-base resolution represents a significant barrier to fully understanding its biological roles. To overcome this limitation and build on our previous work reporting relevant non-CpG methylation modulation associated with different genes17,18,19,20, we developed a bioinformatics tool named “MethPy”. To our knowledge, MethPy is the first open-source tool specifically designed to provide a simple and intuitive identification of the methylation status of any cytosine from bisulfite-treated Sanger sequencing data. Developed in Python and freely available at https://dmsp.web.uniroma1.it/it/methpy, it automatically compares the sequence obtained after sequencing of bisulfite-treated, amplified samples with the corresponding genomic reference and reports the methylation status of both CpG and non-CpG sites. Different output formats are supported, including histograms and data sheets, allowing users to apply modifications, further processing, calculations and the most appropriate statistical analysis depending on the experimental design. By addressing a current methodological gap, MethPy aims to streamline non-CpG methylation analysis and broaden the scope of functional epigenetics research.
Results
MethPy is a package specifically developed for the analysis of DNA sequences subjected to bisulfite conversion, with a particular focus on automated detection of non-CpG methylation—an analysis not supported by currently available software. It allows users to input experimentally obtained sequences and compare them with reference sequences retrieved from public databases. The software also generates tables and graphical outputs that summarize the results.
Before its release, MethPy was tested on control sequences (native, bisulfite-modified, SssI-methylated/bisulfite-modified) as well as on several experimental sequences previously generated in our lab and already analyzed “manually”, showing complete agreement between the results. Specifically, its output was compared with the previously published methylation status of the human PSEN1, IL-1β and IL-6 promoters18,19,20. Moreover, it was used to compare manual and MethPy-assisted analysis of the human promoters of PICALM and LRP1 (paper in preparation, presented at the 2nd ES International Meeting, 15–17 October 2025, Chicago; abstract book in preparation in Frontiers in Epigenetics and Epigenomics), and of the genomic sequence of miRNA-29a21.
This package consists of seven modules and was built following the instructions on the dedicated page: https://dmsp.web.uniroma1.it/it/methpy. The step-by-step guide for the use of the software is present in the “README” section of the application.
Start module
The start module initializes the library by creating essential folders for references, input sequences, analysis outputs (Word and text), tables, and charts, organizing all components required for the package’s operation and downstream analysis.
Ref module
The Ref module allows the user to specify the reference sequence to be used as the basis for the analysis. Upon execution, a pop-up window opens (Fig.1a), where the user can enter the desired sequence and assign it a custom name for identification.
The Ref module generates two distinct output files. The first contains the sequence in the forward orientation, while the second contains the reverse orientation. These files are distinguished by the suffix F (forward) and R (reverse), respectively. To facilitate input error handling, automatic checks and error messages have been implemented. For example, lowercase characters in the reference sequence are automatically converted to uppercase, and the user is notified if any invalid characters (i.e., other than A, C, G, or T) are detected.
MethPy pop-up windows. The different MethPy modules: (a) Ref module; (b–d) the first three windows displayed during the analysis of the sequence “Tutorial_1F”: (b) sequence selection, (c) identification of the start position, (d) error report.
Tutorial module
The Tutorial module was designed to help users familiarize themselves with the package using simplified sequences. This allows for easier comparison with the reference sequence and facilitates the understanding and visualization of the final steps involved in generating outputs and graphs.
Check module
The Check module compares experimental sequences with reference sequences, reading them from the text files saved by the user in the Input folder. Upon execution, a pop-up window opens (Fig. 1b), allowing the user to select a reference sequence from those previously saved, as well as the path to the sequence to be analyzed, using a drop-down menu. The input file can be selected by clicking the Search button.
To ensure alignment consistency between the start of the two sequences, the package displays a dedicated window (Fig. 1c) showing the values of the respective bases and highlighting possible differences. For each position where the experimental and reference sequences are identical, or where differences are consistent with sodium bisulfite conversion, an “X” is displayed in the bottom line to indicate a match. If the displayed initial region of the sequence is correct, the user can click the Yes button to proceed with the analysis of the remaining sequence. If the alignment is incorrect, the No button can be used to explore alternative starting points. If no appropriate starting point is found, the process can be interrupted. The remainder of the check process is carried out automatically by the module and only stops when it detects differences between the sequence and the reference that cannot be attributed to bisulfite conversion. In such cases, a third type of window appears, prompting the user to specify the type of error found (Fig. 1d). If the sequence is no longer recognized for any reason, the user can terminate the process by clicking Stop.
At the end of the analysis, the module generates two output files. The first is saved in the Output in Word folder and includes: the original name of the sequence in the first line, the strand of reference in the second line, and finally, the analyzed sequence. A color code is used to highlight different features, with a legend provided after the sequence. At the end of the file, potentially relevant positions in the experimental sequence are listed under three distinct categories: 1—Insertions: positions where insertions were detected and removed to maintain alignment with the reference reading frame; 2—Methylated cytosines: positions identified as methylated; 3—Uncertain cytosines: positions for which the methylation status could not be determined. The second output file produced is a plain text summary of the performed analysis and serves as input for the subsequent Table module.
Table module
Once all the sequences under a given experimental condition have been analyzed, it is often desirable to display the data in both tabular and graphical formats. For this purpose, the Table module can generate a file with .csv extension and, if needed, an Excel file (.xlsx). Once the module is launched, a pop-up window appears in which the user can specify the information to be included in the table. The starting point is the Gene name button, i.e. the reference sequence to be used.
In both file types, and for each sequence, the forward and reverse strands are combined (if available). In the case of Excel files, the individual strands are hidden to improve readability. The generated tables contain information on the positions of cytosines, the presence or absence of CpGs, and the starting position of the reference sequence. For each sequence, methylation status is reported, including the number of methylated cytosines expressed as both an absolute value and a percentage (Fig. 2a, b).
Possible MethPy outputs. (a, b) The Excel file created through the Table module using the tutorial sequences: (a) detail of the file; (b) complete overview. (c–e) Demonstrative plots: (c) Methylation % of all the cytosines, with CpG sites in red and non-CpG sites in gray, (d) Methylation % of CpGs only, (e) Methylation % of non-CpGs only.
Plot module
The Plot module is used to generate graphs and was specifically designed to offer extensive customization options to the user. Once launched, a first window allows the user to input specifications. A drop-down menu is available to select the table containing the data to be used for generating the graphs. All other fields are optional. In addition to specifying the graph title, the user can also define the resolution and file extension. The last two buttons open additional windows that provide options to customize the base range and error bars, as well as to define custom chart colors.
Three examples of plots that can be generated using this module are shown in Fig. 2c–e, using the “tutorial” sequence.
Init module
The last module, Init, contains most of the functions and classes used by the other modules.
MethPy is freely available on the website of the Dept. of Experimental Medicine, Sapienza University of Rome. The package is released in different versions and can be run on Windows and iOS.
Discussion
Sequence analysis of PCR products obtained after bisulfite treatment to study the methylation profile of specific DNA regions was initially performed by manual comparison of Sanger sequencing results with the corresponding sequence deposited in GenBank12,13. As this type of analysis expanded, several software tools capable of automatically comparing bisulfite-treated and reference sequences were developed14,15,16. Although a large part of DNA methylation studies is now shifting toward genome-wide approaches22,23,24,25,26, bisulfite assay followed by targeted Sanger sequencing allows the methylation level of each cytosine in a given sequence to be measured, resulting in highly accurate methylation profiling18,19,20,21,27,28,29. In this context, therefore, the use of bioinformatics tools to facilitate the readout of the obtained sequences is essential. However, the tools released so far to analyse bisulfite-PCR generated sequences do not allow assessment of the methylation status of cytosines outside of CpG dinucleotides. This so-called “non-CpG” (or CpH) methylation was historically considered of doubtful relevance in terms of both abundance and biological function3. In recent years, however, multiple reports have indicated widespread and functionally relevant non-CpG methylation in adult tissues, particularly associated with gene expression regulation5,9.
Our laboratory is actively engaged in the study of non-CpG methylation and has therefore encountered the lack of suitable bioinformatics tools to assist in the analysis of bisulfite-derived sequences. To address this methodological gap and make this process as automated as possible, the MethPy software was recently developed. This Python package was designed to simplify the analysis of this type of sequence and is freely available online. The different software modules were designed to produce both a summary table of the methylation profile and a graphical representation of the results. While MethPy performs most of the sequence analysis automatically, it still requires the user to indicate the type of error encountered in the sequence via a dedicated window.
In the present version, sequence alignment relies on a custom, user-guided procedure rather than on standard alignment algorithms, which are difficult to apply to bisulfite-converted sequences when it is not known whether the sequence is forward or reverse (this depends on the orientation of the amplicon at the T/A cloning step). The software is, however, open source, and such a feature could be implemented in the future.
It should be taken into account that common SNPs or single C-to-T base mutations could represent a confounding factor when analyzing DNA methylation. Although this is likely to be a minor issue in these targeted assays, it would be advisable to run and analyse an unmodified sequence before the bisulfite-modified samples.
MethPy has already been tested on several experimental sequences generated in our laboratory workflow and was used to analyse data included in a recently published paper21, as well as in other manuscripts currently in preparation. To our knowledge, no other tool allows straightforward analysis of non-CpG methylation in Sanger-derived bisulfite sequences. MethPy is not suitable for analysing large-scale or high-throughput sequencing data; conversely, software designed to analyse such data is not suitable for Sanger-derived sequences. MethPy is open-access and allows third parties to contribute to its further improvement and development.
Due to the increasing interest in the role of non-CpG methylation and the continued widespread use of the bisulfite assay to study the methylation status of target DNA sequences, we suggest that MethPy could be a valuable tool for many epigeneticists, enabling rapid and accurate analysis of Sanger sequences generated after bisulfite treatment and PCR.
Methods
The MethPy package was developed through the Packaging: Python Project (online version).
MethPy was developed to detect the presence of non-CpG methylation in bisulfite-treated DNA sequences. It is structured as a modular package, with each module dedicated to a specific part of the analysis workflow. The modules used include os, sys, and pathlib for directory access and management; tkinter for generating pop-up windows; json for creating and handling JSON files; and Bio (from Biopython) for managing .ab1 files. Word file generation is handled using the docx module, while Excel file handling relies on openpyxl, xlsxwriter, and xlwings.
For data handling, pandas and numpy are used. Plotting is performed using pyplot (from matplotlib) and seaborn. The ctypes module is employed to manage DPI awareness on Windows systems.
Start module
The folder structure is generated using the Start module, which creates a set of directories starting from the working directory. All generated folders are utilized by the various MethPy modules and are described in the Results section.
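As an illustration of this step, the following minimal sketch creates a comparable folder tree with pathlib. It is not the MethPy source: the function name `create_folders` is hypothetical, the folder names are taken from those quoted in the text, and "Charts" is an assumed placeholder for the chart folder, whose actual name is not stated.

```python
from pathlib import Path

# Folder names quoted in the text; "Charts" is a placeholder name (assumption).
FOLDERS = [
    "References",
    "Input",
    "Output in Word",
    "Output in txt",
    "Tables",
    "Charts",
]


def create_folders(base_dir):
    """Create the working folders under base_dir and return their paths."""
    base = Path(base_dir)
    created = []
    for name in FOLDERS:
        folder = base / name
        # exist_ok=True makes the call safe to repeat on an existing tree.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling `create_folders(".")` twice leaves the tree unchanged, which is convenient if the initialization step is run again in an existing working directory.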
Tutorial module
The Tutorial module generates example sequences to help users become familiar with the analysis workflow. It creates a folder named “Sequence tutorial” within the “Input” directory, where all the generated sequences are stored. A reference file named TutorialF.txt is saved in the “References” folder. The module then uses the “create_reverse” function to generate the reverse complement of the sequence, which is saved as TutorialR.txt.
Starting from the reference sequence, the module generates a set of sequences for analysis. These sequences simulate methylated and unmethylated cytosines treated with sodium bisulfite. Each generated sequence contains at least one nucleotide that differs from the reference, simulating potential sequencing errors. Additionally, the sequences are deliberately shortened and do not span the full reference length, mimicking real scenarios in which sequence reads may be truncated at the ends. Five forward and five reverse sequences are generated and saved in the “Input/Sequence tutorial” directory.
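The simulation of bisulfite treatment can be sketched as follows. This is an illustrative example only, not the Tutorial module's code: the function name `bisulfite_convert` and the use of 0-based positions are assumptions. It encodes the principle stated in the Introduction, namely that unmethylated cytosines are read as T while methylated cytosines remain C.

```python
def bisulfite_convert(reference, methylated_positions=()):
    """Return the read expected after bisulfite treatment and PCR.

    Unmethylated C is read as T; a C at a methylated position (0-based,
    an assumption of this sketch) is protected and stays C.
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(reference.upper()):
        if base == "C" and index not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)
```

For example, `bisulfite_convert("ACGTCCA", [1])` keeps the cytosine at position 1 and converts the two cytosines at positions 4 and 5.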
Ref module
The Ref module is used to save reference sequences. It displays a pop-up window where the user can enter the name and the nucleotide sequence of the reference. The module converts all characters to uppercase and generates the reverse complement. It also checks for the presence of non-standard nucleotide characters; these are preserved in the reverse sequence, and a list of them is printed as a warning. Before saving, the module verifies whether a file with the same name already exists. If so, it appends a number to the filename to avoid overwriting the existing file. Finally, ref.py saves both the forward and reverse sequences as text files in the “References” folder.
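The checks described above can be sketched as follows. This is a hedged illustration rather than the content of ref.py: the helper names (`clean_reference`, `reverse_complement`, `unique_path`), the removal of whitespace, and the `_1`, `_2` numbering pattern are all assumptions, since the text only states that a number is appended to the filename.

```python
from pathlib import Path

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def clean_reference(raw):
    """Uppercase the input and list any characters other than A, C, G, T."""
    # Whitespace removal is an assumption of this sketch.
    sequence = "".join(raw.split()).upper()
    invalid = sorted({base for base in sequence if base not in COMPLEMENT})
    return sequence, invalid


def reverse_complement(sequence):
    """Reverse-complement; non-standard characters are kept unchanged."""
    return "".join(COMPLEMENT.get(base, base) for base in reversed(sequence))


def unique_path(folder, stem, suffix=".txt"):
    """Return a path that does not overwrite an existing file."""
    candidate = Path(folder) / f"{stem}{suffix}"
    counter = 1
    while candidate.exists():
        # Hypothetical numbering scheme: GeneF_1.txt, GeneF_2.txt, ...
        candidate = Path(folder) / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```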
Check module
The Check module starts by generating a text file and a Word file, setting all the necessary formatting parameters such as margins and font style. A pop-up window is then displayed, containing a drop-down menu to select the “Gene name”, which lists the names of all files saved in the “References” folder. Below this, there is a text box where the user must enter the “Path of the sequence” to be analyzed. To facilitate this process, a “Search” button on the left allows the user to browse folders to locate the target sequence. If any required information is missing or invalid, the module displays a warning message prompting the user to provide valid input. Otherwise, the workflow proceeds.
The module then opens both the forward and reverse reference sequences.
It then begins writing information to the Word file. In the first row, it includes the name of the sequence to be analyzed, hereafter referred to simply as the “sequence.”
If the file is in .ab1 format, the module extracts the nucleotide sequence from it; if it is a plain text file, it is opened directly. Any line breaks within the sequence are removed. The module loads both the forward and reverse references and calculates the shortest common length between the reference and the sequence to define the comparison window. The module then searches for the starting position of the sequence alignment relative to the reference. It begins by analyzing a 10-nucleotide window, searching for an exact match between the sequence and the forward reference. If a match is found, a pop-up window is displayed. This window shows the label “Sequence”, followed by 65 nucleotides starting from the matched position. Below that, the direction of the reference (forward or reverse) is shown, along with the 65-nucleotide matching region from the reference. A row of “X”s indicates exact matches or base differences explainable by bisulfite conversion. The interface also displays the total number of exact or explainable matches within the 65-nucleotide window. Three buttons are available at the bottom: “Yes” to confirm that the correct start has been found, “No” to continue the search, and “Cannot find the start” to stop the process if no valid alignment is detected. If the third option is selected, the module terminates and displays a warning message.
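The start search and the row of “X”s can be illustrated with the sketch below. It rests on several assumptions and is not MethPy's implementation: the function names are hypothetical, only the C-to-T case is treated as a bisulfite-compatible difference, and the 10-nucleotide seed is compared allowing such conversions, whereas the text describes the seed comparison as an exact match.

```python
def compatible(ref_base, read_base):
    """True if the bases are identical or differ by a C-to-T conversion."""
    return ref_base == read_base or (ref_base == "C" and read_base == "T")


def candidate_starts(read, reference, seed=10):
    """Return reference offsets where the first `seed` read bases fit."""
    probe = read[:seed]
    starts = []
    for offset in range(len(reference) - len(probe) + 1):
        window = reference[offset:offset + len(probe)]
        if all(compatible(r, s) for r, s in zip(window, probe)):
            starts.append(offset)
    return starts


def match_row(reference, read, offset, width=65):
    """Return the 'X' row and the number of matches in the preview window."""
    ref_part = reference[offset:offset + width]
    read_part = read[:width]
    row = "".join(
        "X" if compatible(r, s) else " " for r, s in zip(ref_part, read_part)
    )
    return row, row.count("X")
```

Each offset returned by `candidate_starts` corresponds to one proposal that the user would accept with “Yes” or reject with “No”; `match_row` produces the line shown under the two sequences together with the match count.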
At this point, the main analysis begins. The module compares the sequence with the reference, annotating methylated cytosines and identifying conversions of the unmethylated ones. Each nucleotide is printed in the Word document. Methylated cytosines are highlighted in green, while converted cytosines are highlighted in yellow. When discrepancies not attributable to bisulfite conversion are detected, another pop-up window appears. This window displays 20 nucleotides of the sequence above 20 nucleotides from the reference. At the bottom, four buttons are available: three correspond to possible sequencing errors (deletion, insertion, or base change), and one is a “Stop” button, used when the sequence diverges excessively from the reference. In the case of a deletion, the missing nucleotide is inserted into the Word file with cyan highlighting – or red if the deleted base is a cytosine.
In the case of a base change, the differing base is highlighted in pink – or in blue if a cytosine has been replaced by another nucleotide. In the case of an insertion, the extra base is removed from the sequence to preserve alignment with the reference. However, its position is stored in a list, along with the positions of methylated cytosines and cytosines with unknown methylation status.
These lists are printed at the end of the Word document and saved in a text file. The text file includes the name of the sequence, the list of methylated cytosine positions, a divider labeled “doubtful_c_positions” followed by the positions of cytosines with unknown methylation status, the last analyzed position, and the strand orientation.
The Word and text files are saved, respectively, in the “Output in Word” and “Output in txt” folders, following the same relative path structure as in the “Input” folder.
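The core comparison performed by the Check module can be summarized with the following sketch, under simplifying assumptions: the read is already aligned to the reference from its first base, positions are 0-based, and mismatches are only recorded instead of being resolved interactively as deletions, insertions, or base changes. The function name `call_cytosines` is hypothetical.

```python
def call_cytosines(reference, read):
    """Classify each reference cytosine covered by an aligned read.

    Returns three lists of 0-based positions: methylated cytosines,
    converted (unmethylated) cytosines, and positions whose difference
    cannot be explained by bisulfite conversion.
    """
    methylated, unmethylated, mismatches = [], [], []
    for position, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base == "C":
            if read_base == "C":
                methylated.append(position)    # protected from conversion
            elif read_base == "T":
                unmethylated.append(position)  # converted by bisulfite
            else:
                mismatches.append(position)    # cytosine replaced by another base
        elif ref_base != read_base:
            mismatches.append(position)
    return methylated, unmethylated, mismatches
```

In MethPy, the positions collected here as mismatches are the points at which the error window is shown and the user chooses how to proceed.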
Table module
The Table module accesses the “Tables” folder inside the current working directory. A pop-up window is then generated, prompting the user to select the gene name, which – just as in the Check module – is selected from a drop-down menu containing all file names stored in the “References” folder. A text box is provided to enter the folder path, which can alternatively be selected via folder browsing; this should point to the “Output in txt” folder, which contains all text files generated during the analysis. Another text box is available for the “Reference start position,” indicating the position at which the reference sequence aligns with the deposited sequence. A checkbox is also present to indicate whether an Excel file should be generated. If any of the required information is missing or invalid, a warning message is displayed.
The module collects all input data and saves them. It then opens the reference sequence and generates a dataframe, where the column headers represent the positions of cytosines in the reference. The first row corresponds to their positions relative to the deposited sequence.
For each text file in the selected folder, the module reads the strand orientation, the last nucleotide analyzed, the positions of methylated cytosines, and those with unknown methylation status. It compiles a table in which unmethylated cytosines are marked as 0, methylated cytosines as 1, and all others—either unknown or not covered by the sequence—as 2. All data are converted to strings to improve dataframe readability.
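The 0/1/2 coding can be sketched as below. MethPy builds a pandas dataframe; for brevity this illustration uses a plain dictionary, and the function name `encode_read` and its arguments are hypothetical. Coverage is modelled only through the last analyzed position, which is a simplification.

```python
def encode_read(cytosine_positions, methylated, doubtful, last_position):
    """Return {position: "0" | "1" | "2"} for one analysed read.

    "1" = methylated, "0" = unmethylated, "2" = unknown or not covered.
    Values are strings, as in the table described in the text.
    """
    methylated, doubtful = set(methylated), set(doubtful)
    row = {}
    for position in cytosine_positions:
        if position > last_position or position in doubtful:
            row[position] = "2"
        elif position in methylated:
            row[position] = "1"
        else:
            row[position] = "0"
    return row
```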
The dataframe is then sorted so that entries with the same sequence name are grouped together, with the forward (F) strand preceding the reverse (R) strand. A combined row, labeled “F-R”, merges the information from both strands. If only one strand is available, the F-R row is a copy of that strand.
All CpG cytosines are identified within the reference sequence, and their positions are stored in a list, along with those of the “CCTCC” motif. A dedicated row is added to the dataframe to indicate CpG positions.
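A minimal sketch of this annotation step is given below. The function names are hypothetical, positions are 0-based, and the choice to record every cytosine lying inside a CCTCC occurrence (rather than, for instance, only the motif start) is an assumption.

```python
def cytosine_positions(reference):
    """0-based positions of all cytosines in the reference."""
    return [i for i, base in enumerate(reference) if base == "C"]


def cpg_positions(reference):
    """0-based positions of cytosines immediately followed by a guanine."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def motif_cytosines(reference, motif="CCTCC"):
    """0-based positions of cytosines lying inside any motif occurrence."""
    hits = set()
    start = reference.find(motif)
    while start != -1:
        hits.update(start + k for k, base in enumerate(motif) if base == "C")
        # Searching from start + 1 also finds overlapping occurrences.
        start = reference.find(motif, start + 1)
    return sorted(hits)
```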
At the end of the dataframe, a row summarizing the total number and percentage of methylated cytosines is added. This dataframe is saved as a CSV file in the “Table” folder. If the Excel checkbox is selected, an .xlsx file is also generated. This Excel file contains a legend in the top-left corner and displays only the F-R rows, with the other rows hidden for clarity. In the Excel version, CpG cytosines are written in red and CCTCC cytosines in green. Cell colors follow the legend: black for methylated, white for unmethylated, and blue for unknown status.
All calculations in the Excel file are performed using Excel formulas, allowing any manual changes to the table to automatically update the totals at the bottom.
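The summary row can be illustrated as follows. This is a sketch under a stated assumption: reads coded "2" (unknown or not covered) are excluded from the denominator when computing the percentage. The text does not specify how MethPy treats these entries, and the function name `summarize` is hypothetical.

```python
def summarize(rows):
    """Summarize per-position methylation across reads.

    rows: list of {position: "0" | "1" | "2"} dictionaries, one per read.
    Returns {position: (methylated_count, percentage)}. Entries coded "2"
    are left out of the denominator (an assumption of this sketch).
    """
    summary = {}
    positions = sorted({p for row in rows for p in row})
    for position in positions:
        calls = [row[position] for row in rows if row.get(position) in ("0", "1")]
        count = calls.count("1")
        percent = 100.0 * count / len(calls) if calls else 0.0
        summary[position] = (count, percent)
    return summary
```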
If the module is unable to save the Excel file in the “Table” folder, a warning message is displayed.
Plot module
The Plot module generates a pop-up window in which the table file must be selected via a drop-down menu; both CSV and Excel file formats are supported. The user can also enter the gene name and experimental condition, which will be included in the graph. The default resolution is set to 200 DPI, but this can be modified by entering a different value in the corresponding text box. A drop-down menu allows the user to select the output file format; available options are TIFF, JPEG, and PDF.
Two additional options in the window are “Custom settings for base range and errors” and “Custom chart colors”. Clicking the “Click me!” button of the first option opens a new window, where the user can define the range of bases to be plotted. This feature is useful for excluding regions such as primer-binding sites, which may introduce bias in the methylation analysis. Error bars can also be added using one of two options: “Same error value for all the bases” and “Custom errors for each base; enter a comma-separated list of numbers”. The first applies a uniform error to each cytosine, while the second allows the user to input a comma-separated list of individual error values. If the number of errors provided is insufficient to cover all cytosines, the remaining positions will be assigned an error value of 0. An optional checkbox allows users to add an error cap.
Selecting the “Click me!” button of “Custom chart colors” opens a second window dedicated to graphical customization. In this window, users can modify the colors of the CpG and non-CpG bars using the color picker window and choose whether to display the average value as a line over the bars.
By default, CpG bars are colored red, non-CpG bars are gray, no error cap is applied, and no percentage number is plotted over the bar. Once these settings are modified, they are saved locally in the package’s installation directory, so that subsequent uses of the module will load the last used configuration.
The only mandatory input is the table file; all other parameters are either read from a JSON configuration file stored in the package directory or assigned default values (e.g., DPI = 200).
Each time the user selects new settings, the module updates the JSON file with the current configuration.
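This load-and-update behaviour can be sketched with the standard json module. The key names in `DEFAULTS` and the function names are hypothetical; only the default values (200 DPI, red CpG bars, gray non-CpG bars, no error cap, no percentage label) come from the text.

```python
import json
from pathlib import Path

# Key names are illustrative; the default values are those stated in the text.
DEFAULTS = {
    "dpi": 200,
    "cpg_color": "red",
    "non_cpg_color": "gray",
    "error_cap": False,
    "show_percentage": False,
}


def load_settings(path):
    """Return the defaults overridden by any values stored in the JSON file."""
    settings = dict(DEFAULTS)
    config = Path(path)
    if config.exists():
        settings.update(json.loads(config.read_text(encoding="utf-8")))
    return settings


def save_settings(path, settings):
    """Write the current configuration so the next run can reload it."""
    Path(path).write_text(json.dumps(settings, indent=2), encoding="utf-8")
```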
The module checks whether the input file is a CSV. If so, it extracts the percentage of methylation, the list of cytosine positions, and CpG-related information. If the file is an Excel spreadsheet, it retrieves the same information directly from the table.
If a base range is specified, only cytosines within that range are considered. If a list of error values is provided, it is added to the dataframe. If the list is shorter than the number of cytosines, the module displays a warning: “An error was not provided for each base; for the last positions, 0 was used as the error.”
Positive and negative errors are assumed to be symmetric. The module ensures that the upper bound of each error-adjusted value does not exceed 100, and that the lower bound does not fall below 0. Cytosines with values of exactly 0 or 100 do not receive error bars.
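The error handling described above can be sketched as follows. The function names are hypothetical and the example reflects one reading of the rules: missing errors are filled with 0, bar lengths are clipped so that the plotted interval stays within 0 to 100, and values of exactly 0 or 100 receive no error bar.

```python
def pad_errors(errors, n_positions):
    """Pad a short error list with zeros; report whether padding was needed."""
    errors = list(errors)[:n_positions]
    padded = len(errors) < n_positions
    return errors + [0.0] * (n_positions - len(errors)), padded


def error_bounds(value, error):
    """Return (lower, upper) error-bar lengths for a methylation percentage."""
    if value <= 0 or value >= 100:
        return 0.0, 0.0              # no error bar at exactly 0 or 100
    lower = min(error, value)        # the bar cannot extend below 0
    upper = min(error, 100 - value)  # the bar cannot extend above 100
    return lower, upper
```

The `padded` flag returned by `pad_errors` corresponds to the situation in which the warning about missing error values would be shown.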
A list of all cytosine positions, as well as separate lists for CpG and non-CpG positions, is generated.
Total methylation is then plotted. The first graph includes all cytosine positions, with CpG values highlighted in a different color, as specified by the user. If selected, the percentage of methylation is also plotted above each bar.
Separate plots are also generated for CpG and non-CpG methylation, following the same layout and graphical settings.
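A bar chart of this kind can be sketched with matplotlib as shown below. This is an illustration, not the Plot module itself: the function names, figure size, and axis labels are assumptions, and seaborn, which MethPy also uses, is omitted. matplotlib is imported inside the plotting function so that the color helper can be used on its own.

```python
def bar_colors(positions, cpg_positions, cpg_color="red", non_cpg_color="gray"):
    """Return one color per cytosine, highlighting CpG positions."""
    cpg = set(cpg_positions)
    return [cpg_color if p in cpg else non_cpg_color for p in positions]


def plot_methylation(positions, percentages, cpg_positions, out_path,
                     dpi=200, title=""):
    """Save a bar chart of methylation % per cytosine position."""
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend; no display is needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 4))
    labels = [str(p) for p in positions]
    ax.bar(labels, percentages, color=bar_colors(positions, cpg_positions))
    ax.set_ylim(0, 100)
    ax.set_xlabel("Cytosine position")
    ax.set_ylabel("Methylation %")
    ax.set_title(title)
    fig.savefig(out_path, dpi=dpi)
    plt.close(fig)
```

The CpG-only and non-CpG-only plots can be obtained from the same function by filtering `positions` and `percentages` before the call.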
Data availability
The software is free. The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.
References
-
Tollefsbol, T. Epigenetics in Human Disease 1–1308 (Springer, 2023).
-
Mattei, A. L., Bailly, N. & Meissner, A. DNA methylation: a historical perspective. Trends Genet. 38, 676–707 (2022).
-
Ramsahoye, B. H. et al. Non-CpG methylation is prevalent in embryonic stem cells and may be mediated by DNA methyltransferase 3a. Proc. Natl. Acad. Sci. U. S. A. 97, 5237–5242 (2000).
-
Titcombe, P. et al. Human non-CpG methylation patterns display both tissue-specific and inter-individual differences suggestive of underlying function. Epigenetics 17, 653–664 (2022).
-
Guo, J. U. et al. Distribution, recognition and regulation of non-CpG methylation in the adult mammalian brain. Nat. Neurosci. 17, 215–222 (2014).
-
Jang, H.S. et al. CpG and Non-CpG methylation in epigenetic gene regulation and brain function. Genes (Basel) 8, E14 (2017).
-
Fuso, A., Ferraguti, G., Scarpa, S., Ferrer, I. & Lucarelli, M. Disclosing bias in bisulfite assay: MethPrimers underestimate high DNA methylation. PLoS ONE 10, e0118318 (2015).
-
Pietrzak, M., Rempala, G. A., Nelson, P. T. & Hetman, M. Non-random distribution of methyl-CpG sites and non-CpG methylation in the human rDNA promoter identified by next generation bisulfite sequencing. Gene 585, 35–43 (2016).
-
de Mendoza, A. et al. The emergence of the brain non-CpG methylation system in vertebrates. Nat. Ecol. Evol. 5, 369–378 (2021).
-
Scarano, C. et al. The third-generation sequencing challenge: novel insights for the omic sciences. Biomolecules 14, 568 (2024).
-
Chenarani, N. et al. Bioinformatic tools for DNA methylation and histone modification: a survey. Genomic 113, 1098–1113 (2021).
-
Clark, S. J., Statham, A., Stirzaker, C., Molloy, P. L. & Frommer, M. DNA methylation: bisulfite modification and analysis. Nat. Protoc. 1, 2353–2364 (2006).
-
Oakeley, E. J. DNA methylation analysis: a review of current methodologies. Pharmacol. Ther. 84, 389–400 (1999).
-
Zhou, Q., Lim, J. Q., Sung, W. K. & Li, G. An integrated package for bisulfite DNA methylation data analysis with Indel-sensitive mapping. BMC Bioinform. 20, 47 (2019).
-
Bock, C. et al. BiQ Analyzer: visualization and quality control for DNA methylation data from bisulfite sequencing. Bioinformatics 21, 4067–4068 (2005).
Grunau, C., Schattevoy, R., Mache, N. & Rosenthal, A. MethTools–a toolbox to visualize and analyze DNA methylation data. Nucleic Acids Res. 28, 1053–1058 (2000).
Fuso, A. et al. Early demethylation of non-CpG, CpC-rich, elements in the myogenin 5’-flanking region: a priming effect on the spreading of active demethylation. Cell Cycle 9, 3965–3976 (2010).
Nicolia, V. et al. DNA methylation profiles of selected pro-inflammatory cytokines in Alzheimer disease. J. Neuropathol. Exp. Neurol. 76, 27–31 (2017).
Monti, N. et al. CpG and non-CpG Presenilin1 methylation pattern in course of neurodevelopment and neurodegeneration is associated with gene expression in human and murine brain. Epigenetics 15, 781–799 (2020).
Raia, T. et al. Perinatal S-adenosylmethionine supplementation represses PSEN1 expression by the cellular epigenetic memory of CpG and non-CpG methylation in adult TgCRD8 mice. Int. J. Mol. Sci. 24, 11675 (2023).
Raia, T. et al. One-carbon metabolism modulates miR-29a-DNA methylation crosstalk in Alzheimer’s disease. Alzheimers Dement. 21, e70703 (2025).
Perzel Mandell, K. A. et al. Genome-wide sequencing-based identification of methylation quantitative trait loci and their role in schizophrenia risk. Nat. Commun. 12, 5251 (2021).
Nagamatsu, S. T. et al. CpH methylome analysis in human cortical neurons identifies novel gene pathways and drug targets for opioid use disorder. Front. Psychiatry 13, 1078894 (2023).
Walayat, A. et al. Maternal e-cigarette exposure alters DNA methylome, site-specific CpG and CH methylation, and transcriptomic signatures in the neonatal brain. Sci. Rep. 14, 24263 (2024).
Schaffner, S. L. et al. Distinct impacts of alpha-synuclein overexpression on the hippocampal epigenome of mice in standard and enriched environments. Neurobiol. Dis. 186, 106274 (2023).
Ramasamy, D., Rao, A. K. D. M., Rajkumar, T. & Mani, S. Experimental and computational approaches for non-CpG methylation analysis. Epigenomes 6, 24 (2022).
Signal, B., Pérez Suárez, T. G., Taberlay, P. C. & Woodhouse, A. Cellular specificity is key to deciphering epigenetic changes underlying Alzheimer’s disease. Neurobiol. Dis. 186, 106284 (2023).
Boltz, V. F. et al. CpG methylation profiles of HIV-1 pro-viral DNA in individuals on ART. Viruses 13, 799 (2021).
Yang, A. et al. FZD7, regulated by non-CpG methylation, plays an important role in immature porcine sertoli cell proliferation. Int. J. Mol. Sci. 24, 6179 (2023).
Acknowledgements
The authors are grateful to Prof. Allegra Via for the critical reading of the manuscript and to Dr. Emiliano Valente for his valuable suggestions on Python programming.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Roiati, M., Borges, L.D.F., Cattani, A. et al. MethPy: a new software for analyzing non-CpG methylation after bisulfite assay and Sanger sequencing. Sci Rep 15, 42068 (2025). https://doi.org/10.1038/s41598-025-26089-8
DOI: https://doi.org/10.1038/s41598-025-26089-8


