Microbiome Data Analysis and Visualization Pipeline using QIIME and Phyloseq

gene_x

Tags: metagenomics, 16S

  1. Setup environment: This step creates a Conda environment, activates it, and installs the QIIME software package from the Bioconda channel. This ensures a clean and isolated environment for the analysis.

    conda create -n qiime1
    conda activate qiime1
    conda install -c bioconda qiime
    
  2. Run FastQC for sequence quality inspection: FastQC is run on the raw data files to generate quality reports, which allow researchers to evaluate the quality of their sequence data before proceeding with downstream analyses.

    mkdir fastqc_out
    fastqc -t 4 raw_data/* -o fastqc_out/
    
  3. Rename files: This step renames the raw data files to have more meaningful names, which can be useful when tracking samples throughout the analysis.

    cd raw_data
    for file in *.fastq.gz; do mv "$file" "$(echo "$file" | cut -d'_' -f1 | cut -d'-' -f1-1)_$(echo "$file" | cut -d'_' -f4).fastq.gz"; done
    cd ..
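
Before renaming anything, it can help to preview what the `cut` pipeline produces. The following is a minimal sketch, not part of the original workflow: `new_name` and `preview_renames` are hypothetical helpers, and the Illumina-style input name in the usage note is an assumption, since the post does not show the original file names.

```shell
#!/usr/bin/env bash
# Derive the new name the same way the rename loop does.
# $1: original file name
# $2: dash-field range for the second cut (default 1-1, as in the loop)
new_name() {
  local file=$1 dash_fields=${2:-1-1}
  local sample mate
  sample=$(echo "$file" | cut -d'_' -f1 | cut -d'-' -f"$dash_fields")
  mate=$(echo "$file" | cut -d'_' -f4)
  echo "${sample}_${mate}.fastq.gz"
}

# Dry run: print the planned renames without touching any file.
preview_renames() {
  local f
  for f in "$@"; do
    echo "$f -> $(new_name "$f")"
  done
}
```

With a hypothetical input such as `3-9141vag_S3_L001_R1_001.fastq.gz`, `new_name` with the default range keeps only the leading number, while `new_name <file> 1-2` keeps the full `3-9141vag` label, which matches the file names used in step 4. Check which variant fits your own file names before running the real loop.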
    
  4. Trim paired-end reads: Trimmomatic is used to remove low-quality bases and adapter sequences from the paired-end reads. This step ensures that only high-quality reads are used in downstream analyses.

    mkdir trim_data trimmed_unpaired
    cd raw_data
    for file in 3-9141vag_R1.fastq.gz 16-9148stuhl_R1.fastq.gz 4-9140vag_R1.fastq.gz 15-9136stuhl_R1.fastq.gz 2-9133vag_R1.fastq.gz 11-9133stuhl_R1.fastq.gz 7-9148vag_R1.fastq.gz 10-9135stuhl_R1.fastq.gz 12-9141stuhl_R1.fastq.gz 5-9161vag_R1.fastq.gz; do
      # inputs: R1 and R2; outputs: paired/unpaired R1, then paired/unpaired R2
      java -jar /home/jhuang/Tools/Trimmomatic-0.36/trimmomatic-0.36.jar PE -threads 16 \
        "$file" "${file/_R1/_R2}" \
        "../trim_data/$file" "../trimmed_unpaired/$file" \
        "../trim_data/${file/_R1/_R2}" "../trimmed_unpaired/${file/_R1/_R2}" \
        ILLUMINACLIP:/home/jhuang/Tools/Trimmomatic-0.36/adapters/TruSeq3-PE-2.fa:2:30:10:8:TRUE LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36 AVGQUAL:20
    done 2> trimmomatic_pe.log
    # return to the project directory; the following steps use paths relative to it
    cd ..
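
Trimmomatic in PE mode needs both mates of every pair. As a hedged sketch, the check below uses the same `${file/_R1/_R2}` substitution as the loop to confirm that each R1 file has an R2 partner before trimming starts. The helper names `mate_of` and `missing_mates` are hypothetical and not part of Trimmomatic.

```shell
#!/usr/bin/env bash
# Print the R2 mate of an R1 file name, using the same substitution as the trimming loop.
mate_of() {
  local r1=$1
  echo "${r1/_R1/_R2}"
}

# List R1 files in a directory whose R2 mate is missing.
# Returns non-zero if at least one mate is missing.
missing_mates() {
  local dir=$1 r1 missing=0
  for r1 in "$dir"/*_R1.fastq.gz; do
    [ -e "$r1" ] || continue            # the glob matched nothing
    if [ ! -e "$(mate_of "$r1")" ]; then
      echo "missing mate for $r1"
      missing=1
    fi
  done
  return "$missing"
}
```

A possible use is `missing_mates raw_data && echo "all pairs complete"` from the project directory.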
    
  5. Stitch paired-end reads with PANDAseq: PANDAseq is used to merge the trimmed paired-end reads into a single continuous sequence, which simplifies downstream analyses.

    mkdir pandaseq.out
    for file in trim_data/*_R1.fastq.gz; do pandaseq -f ${file} -r ${file/_R1.fastq.gz/_R2.fastq.gz} -l 300 -p CCTACGGGNGGCWGCAG -q GACTACHVGGGTATCTAATCC  -w pandaseq.out/$(echo $file | cut -d'/' -f2 | cut -d'_' -f1-3)_merged.fasta >> LOG_pandaseq; done
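
After merging, a quick sanity check is to count how many sequences each sample retained. This is a small sketch under the assumption that the merged files are plain FASTA with one `>` header per record; `count_fasta_records` and `summarize_merged` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Count FASTA records (header lines starting with '>') in one file.
count_fasta_records() {
  # grep -c prints 0 but exits with status 1 when nothing matches
  grep -c '^>' "$1" || true
}

# Print "file<TAB>count" for every merged FASTA in a directory (default: pandaseq.out).
summarize_merged() {
  local dir=${1:-pandaseq.out} f
  for f in "$dir"/*_merged.fasta; do
    [ -e "$f" ] || continue             # the glob matched nothing
    printf '%s\t%s\n' "$(basename "$f")" "$(count_fasta_records "$f")"
  done
}
```

Samples with very few merged reads are worth inspecting in the PANDAseq log before moving on.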
    
  6. Create QIIME mapping files: This step validates the mapping file that contains the sample metadata, ensuring the correct format for QIIME.

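    # map.txt is the sample-metadata (mapping) file, offered as a download in the original post.
    # Validation writes a corrected copy, map_corrected.txt, which the following steps use.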

    validate_mapping_file.py -m map.txt
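
A QIIME 1 mapping file is tab-separated; its header starts with `#SampleID`, `BarcodeSequence`, `LinkerPrimerSequence` and ends with `Description`. The sketch below is a quick pre-check of that header, not a replacement for `validate_mapping_file.py`. The function name is hypothetical, and the `FileInput` column is assumed from the `-c FileInput` option used in step 7.

```shell
#!/usr/bin/env bash
# Check the header line of a tab-separated QIIME 1 mapping file.
# $1: mapping file
# $2: name of the column holding the FASTA file names (default: FileInput)
check_mapping_header() {
  local map=$1 file_col=${2:-FileInput}
  head -n 1 "$map" | awk -F'\t' -v col="$file_col" '
    BEGIN { rc = 1 }                    # fail unless the header passes
    NR == 1 {
      ok = ($1 == "#SampleID" && $2 == "BarcodeSequence" &&
            $3 == "LinkerPrimerSequence" && $NF == "Description")
      found = 0
      for (i = 1; i <= NF; i++) if ($i == col) found = 1
      if (ok && found) rc = 0
    }
    END { exit rc }'
}
```

For example, `check_mapping_header map.txt || echo "header needs fixing"` flags a header problem before the full validation runs.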
    
  7. Combine files into a labeled file: The add_qiime_labels.py script is used to combine the merged sequence files from PANDAseq into a single file with sample labels, making it suitable for further QIIME analysis.

    add_qiime_labels.py -i pandaseq.out -m map_corrected.txt -c FileInput -o combined_fasta
    
  8. Remove chimeric sequences using USEARCH: Chimeric sequences, which are artificially created during PCR, are identified and removed using USEARCH. This ensures that only genuine sequences are used in downstream analyses.

    cd combined_fasta
    pyfasta split -n 100 combined_seqs.fna
    for i in {000..099}; do echo "identify_chimeric_seqs.py -i combined_fasta/combined_seqs.fna.${i} -m usearch61 -o usearch_checked_combined.${i}/ -r ~/REFs/gg_97_otus_4feb2011_fw_rc.fasta --threads=14;" >> uchime_commands.sh; done
    mv uchime_commands.sh ..
    # the generated commands use paths relative to the project directory, so run them from there
    cd ..
    bash uchime_commands.sh
    # concatenate the chimera lists of all 100 chunks
    cat usearch_checked_combined.{000..099}/chimeras.txt > chimeras.txt
    filter_fasta.py -f combined_fasta/combined_seqs.fna -o combined_fasta/combined_nonchimera_seqs.fna -s chimeras.txt -n;
    #rm -rf usearch_checked_combined.0*
    mkdir usearch_checked_combined_DEL
    mv usearch_checked_combined.0* usearch_checked_combined_DEL
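
The command script above is built with `>>`, so rerunning the loop appends a second copy of every command. A minimal sketch that writes the script from scratch each time is shown below; `write_chimera_commands` is a hypothetical helper, and the chunk naming (`combined_seqs.fna.000` and so on) is taken from the loop above rather than verified against pyfasta.

```shell
#!/usr/bin/env bash
# Write one identify_chimeric_seqs.py call per FASTA chunk into a script.
# $1: output script  $2: reference FASTA
# $3: number of chunks (default 100)  $4: threads (default 14)
write_chimera_commands() {
  local script=$1 ref=$2 chunks=${3:-100} threads=${4:-14} i idx
  : > "$script"                         # start from an empty file
  for ((i = 0; i < chunks; i++)); do
    idx=$(printf '%03d' "$i")           # 000, 001, ... as in the original loop
    echo "identify_chimeric_seqs.py -i combined_fasta/combined_seqs.fna.${idx} -m usearch61 -o usearch_checked_combined.${idx}/ -r ${ref} --threads=${threads}" >> "$script"
  done
}
```

The generated file can then be run with `bash uchime_commands.sh` from the project directory.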
    
  9. Create OTU picking parameter file and run the QIIME open reference picking pipeline: This step creates a parameter file for clustering the sequences into operational taxonomic units (OTUs) and then runs the QIIME pipeline to pick OTUs using both reference-based and de novo methods.

    echo "pick_otus:similarity 0.97" > clustering_params.txt
    echo "assign_taxonomy:similarity 0.97" >> clustering_params.txt
    echo "parallel_align_seqs_pynast:template_fp /home/jhuang/REFs/SILVA_132_QIIME_release/core_alignment/80_core_alignment.fna" >> clustering_params.txt
    echo "assign_taxonomy:reference_seqs_fp /home/jhuang/REFs/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna" >> clustering_params.txt
    echo "assign_taxonomy:id_to_taxonomy_fp /home/jhuang/REFs/SILVA_132_QIIME_release/taxonomy/16S_only/99/consensus_taxonomy_7_levels.txt" >> clustering_params.txt
    echo "alpha_diversity:metrics chao1,observed_otus,shannon,PD_whole_tree" >> clustering_params.txt
    # OTU picking uses the default method (uclust) unless -m is given; with -m usearch61, usearch61_ref handles the reference-based steps and usearch61 the de novo steps
    pick_open_reference_otus.py -r /home/jhuang/REFs/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna -i combined_fasta/combined_nonchimera_seqs.fna -o clustering/ -p clustering_params.txt --parallel
    # the swab and stool runs assume per-sample-type inputs (combined_fasta_swab/, combined_fasta_stool/) prepared the same way as combined_fasta/
    pick_open_reference_otus.py -r /home/jhuang/REFs/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna -i combined_fasta_swab/combined_nonchimera_seqs.fna -o clustering_swab/ -p clustering_params.txt --parallel
    pick_open_reference_otus.py -r /home/jhuang/REFs/SILVA_132_QIIME_release/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna -i combined_fasta_stool/combined_nonchimera_seqs.fna -o clustering_stool/ -p clustering_params.txt --parallel
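
The parameter file repeats the SILVA directory three times, which makes it easy to mistype when the references live elsewhere. One way to keep the paths consistent is to write the file from a single variable. This is a sketch only: `write_clustering_params` is a hypothetical helper, and it assumes the directory layout of the SILVA 132 QIIME release shown above.

```shell
#!/usr/bin/env bash
# Write the QIIME parameter file from a single reference directory.
# $1: SILVA release directory  $2: output file (default: clustering_params.txt)
write_clustering_params() {
  local silva=$1 out=${2:-clustering_params.txt}
  cat > "$out" <<EOF
pick_otus:similarity 0.97
assign_taxonomy:similarity 0.97
parallel_align_seqs_pynast:template_fp ${silva}/core_alignment/80_core_alignment.fna
assign_taxonomy:reference_seqs_fp ${silva}/rep_set/rep_set_16S_only/99/silva_132_99_16S.fna
assign_taxonomy:id_to_taxonomy_fp ${silva}/taxonomy/16S_only/99/consensus_taxonomy_7_levels.txt
alpha_diversity:metrics chao1,observed_otus,shannon,PD_whole_tree
EOF
}
```

Calling `write_clustering_params /home/jhuang/REFs/SILVA_132_QIIME_release` reproduces the six lines written by the `echo` commands above.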
    
  10. Core diversity analyses: The core_diversity_analyses.py script is used to perform multiple diversity analyses, including alpha and beta diversity, on the OTU table generated in the previous step. This provides an overview of the diversity within and between samples.

    core_diversity_analyses.py -o ./core_diversity_e4753 -i ./clustering/otu_table_mc2_w_tax_no_pynast_failures.biom -m ./map_corrected.txt -t ./clustering/rep_set.tre -e 4753 -p ./clustering_params.txt
    core_diversity_analyses.py -o ./core_diversity_e4753_swab -i ./clustering_swab/otu_table_mc2_w_tax_no_pynast_failures.biom -m ./map_corrected_swab.txt -t ./clustering_swab/rep_set.tre -e 4753 -p ./clustering_params.txt
    core_diversity_analyses.py -o ./core_diversity_e4753_stool -i ./clustering_stool/otu_table_mc2_w_tax_no_pynast_failures.biom -m ./map_corrected_stool.txt -t ./clustering_stool/rep_set.tre -e 4753 -p ./clustering_params.txt
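
The `-e` option sets the even-sampling (rarefaction) depth; samples with fewer sequences than this value are dropped from the rarefied analyses. A common choice is the smallest per-sample count you are willing to keep. The sketch below assumes a hypothetical two-column, tab-separated file of sample names and sequence counts (for instance transcribed by hand from a BIOM table summary); the helper names and the input format are illustrative assumptions, not QIIME outputs.

```shell
#!/usr/bin/env bash
# Print the smallest count in a "sample<TAB>count" file.
min_depth() {
  awk -F'\t' '
    NF >= 2 && $2 ~ /^[0-9]+$/ { if (min == "" || $2 + 0 < min) min = $2 + 0 }
    END { if (min != "") print min }' "$1"
}

# List the samples that would be dropped at a given depth.
# $1: depth  $2: "sample<TAB>count" file
samples_below() {
  awk -F'\t' -v d="$1" 'NF >= 2 && $2 + 0 < d { print $1 }' "$2"
}
```

Running `samples_below` with a candidate depth shows which samples a given `-e` value would cost before the full analysis is started.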
    
  11. Supplementary steps for core diversity analyses: These additional analyses collapse samples by a metadata category (here "Group"), sort the OTU table, and generate taxonomic summaries, then test for differences in alpha diversity, beta diversity, and OTU abundance between groups. Note that the paths below come from a run with a sampling depth of 42369 (core_diversity_e42369), whereas the commands in step 10 use a depth of 4753; adjust the directory names to the depth actually used.

    gunzip ./core_diversity_e42369/table_mc42369.biom.gz
    mkdir ./core_diversity_e42369/taxa_plots_Group
    collapse_samples.py -m ./map_corrected.txt -b ./core_diversity_e42369/table_mc42369.biom --output_biom_fp ./core_diversity_e42369/taxa_plots_Group/Group_otu_table.biom --output_mapping_fp ./core_diversity_e42369/taxa_plots_Group/Group_map_corrected.txt --collapse_fields "Group"
    gzip ./core_diversity_e42369/table_mc42369.biom
    
    sort_otu_table.py -i ./core_diversity_e42369/taxa_plots_Group/Group_otu_table.biom -o ./core_diversity_e42369/taxa_plots_Group/Group_otu_table_sorted.biom
    summarize_taxa.py -i ./core_diversity_e42369/taxa_plots_Group/Group_otu_table_sorted.biom -o ./core_diversity_e42369/taxa_plots_Group/
    
    plot_taxa_summary.py -i ./core_diversity_e42369/taxa_plots_Group/Group_otu_table_sorted_L2.txt,./core_diversity_e42369/taxa_plots_Group/Group_otu_table_sorted_L3.txt,./core_diversity_e42369/taxa_plots_Group/Group_otu_table_sorted_L4.txt,./core_diversity_e42369/taxa_plots_Group/Group_otu_table_sorted_L5.txt,./core_diversity_e42369/taxa_plots_Group/Group_otu_table_sorted_L6.txt -o ./core_diversity_e42369/taxa_plots_Group/taxa_summary_plots/
    
    ##alpha diversity##
    for metric in PD_whole_tree chao1 observed_otus shannon; do
      # nonparametric test with 9999 permutations
      compare_alpha_diversity.py -i ./core_diversity_e42369/arare_max42369/alpha_div_collated/${metric}.txt -m ./map_corrected.txt -c "Group" -o ./core_diversity_e42369/arare_max42369_Group/compare_${metric} -n 9999
      # parametric t-test
      compare_alpha_diversity.py -i ./core_diversity_e42369/arare_max42369/alpha_div_collated/${metric}.txt -m ./map_corrected.txt -c "Group" -o ./core_diversity_e42369/arare_max42369_Group/compare_${metric}_tt -t parametric
    done
    
    ##beta diversity statistics##
    make_distance_boxplots.py -d ./core_diversity_e42369/bdiv_even42369/weighted_unifrac_dm.txt -f "Group" -o ./core_diversity_e42369/bdiv_even42369_Group/weighted_unifrac_boxplots/ -m ./map_corrected.txt --save_raw_data -n 9999
    make_distance_boxplots.py -d ./core_diversity_e42369/bdiv_even42369/unweighted_unifrac_dm.txt -f "Group" -o ./core_diversity_e42369/bdiv_even42369_Group/unweighted_unifrac_boxplots/ -m ./map_corrected.txt --save_raw_data -n 9999
    #make_distance_boxplots.py -d ./core_diversity_e42369/bdiv_even42369/unweighted_unifrac_dm.txt -f "Group" -o ./core_diversity_e42369/bdiv_even42369_Group/unweighted_unifrac_boxplots/ -m ./map_corrected.txt -g png
    compare_categories.py --method adonis -i ./core_diversity_e42369/bdiv_even42369/unweighted_unifrac_dm.txt -m ./map_corrected.txt -c "Group" -o ./core_diversity_e42369/bdiv_even42369_Group/adonis_out -n 9999
    compare_categories.py --method anosim -i ./core_diversity_e42369/bdiv_even42369/weighted_unifrac_dm.txt -m ./map_corrected.txt -c "Group" -o ./core_diversity_e42369/bdiv_even42369_Group/weighted_anosim_out -n 9999
    compare_categories.py --method anosim -i ./core_diversity_e42369/bdiv_even42369/unweighted_unifrac_dm.txt -m ./map_corrected.txt -c "Group" -o ./core_diversity_e42369/bdiv_even42369_Group/unweighted_anosim_out -n 9999
    
    ##using even.biom file to generate group significance##
    gunzip ./core_diversity_e42369/table_even42369.biom.gz
    group_significance.py -i ./core_diversity_e42369/table_even42369.biom -m ./map_corrected.txt -c "Group" -s kruskal_wallis -o ./core_diversity_e42369/group_significance_Group_kw_ocs.txt --biom_samples_are_superset --print_non_overlap
    group_significance.py -i ./core_diversity_e42369/table_even42369.biom -m ./map_corrected.txt -c "Group" -s g_test -o ./core_diversity_e42369/group_significance_Group_gtest_ocs.txt
    gzip ./core_diversity_e42369/table_even42369.biom
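
The decompress, run, recompress pattern appears twice in this step. If a command fails in between, the table stays uncompressed and the links in index.html (which point at the `.gz` files) break. A small wrapper can guarantee the recompression; `with_gunzipped` is a hypothetical helper sketched here, not a QIIME script.

```shell
#!/usr/bin/env bash
# Decompress a .gz file, run a command, then recompress even if the command fails.
# $1: path to the .gz file; remaining arguments: the command to run
with_gunzipped() {
  local gz=$1; shift
  local plain=${gz%.gz} status=0
  gunzip "$gz" || return 1
  "$@" || status=$?                     # remember the exit status of the command
  gzip "$plain"
  return "$status"
}
```

For example: `with_gunzipped ./core_diversity_e42369/table_even42369.biom.gz group_significance.py -i ./core_diversity_e42369/table_even42369.biom -m ./map_corrected.txt -c "Group" -s g_test -o ./core_diversity_e42369/group_significance_Group_gtest_ocs.txt`.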
    
  12. Create an index.html to organize all the output above: The index.html file links to the tables, plots, and statistics generated in the previous steps so that they can be opened from one page. The links are relative, so the file is assumed to sit in the core diversity output directory (here core_diversity_e42369).

    <html>
    <head><title>QIIME results</title></head>
    <body>
    <a href="http://www.qiime.org" target="_blank"><img src="http://qiime.org/_static/wordpressheader.png" alt="www.qiime.org"/></a><p>
    <table border=1>
    
    <tr colspan=2 align=center bgcolor=#e8e8e8><td colspan=2 align=center>Run summary data</td></tr>
    <tr><td>Master run log</td><td> <a href="./log_20230403124456.txt" target="_blank">log_20230403124456.txt</a></td></tr>
    <tr><td>BIOM table statistics</td><td> <a href="./biom_table_summary.txt" target="_blank">biom_table_summary.txt</a></td></tr>
    <tr><td>Filtered BIOM table (minimum sequence count: 42369)</td><td> <a href="./table_mc42369.biom.gz" target="_blank">table_mc42369.biom.gz</a></td></tr>
    <tr><td>Rarified BIOM table (sampling depth: 42369)</td><td> <a href="./table_even42369.biom.gz" target="_blank">table_even42369.biom.gz</a></td></tr>
    
    <tr colspan=2 align=center bgcolor=#e8e8e8><td colspan=2 align=center>Taxonomic summary results</td></tr>
    <tr><td>Taxa summary bar plots</td><td> <a href="./taxa_plots/taxa_summary_plots/bar_charts.html" target="_blank">bar_charts.html</a></td></tr>
    <tr><td>Taxa summary area plots</td><td> <a href="./taxa_plots/taxa_summary_plots/area_charts.html" target="_blank">area_charts.html</a></td></tr>
    <tr colspan=2 align=center bgcolor=#e8e8e8><td colspan=2 align=center>Taxonomic summary results (by Group)</td></tr>
    <tr><td>Taxa summary bar plots</td><td> <a href="./taxa_plots_Group/taxa_summary_plots/bar_charts.html" target="_blank">bar_charts.html</a></td></tr>
    <tr><td>Taxa summary area plots</td><td> <a href="./taxa_plots_Group/taxa_summary_plots/area_charts.html" target="_blank">area_charts.html</a></td></tr>
    
    <tr colspan=2 align=center bgcolor=#e8e8e8><td colspan=2 align=center>Alpha diversity results</td></tr>
    <tr><td>Alpha rarefaction plots</td><td> <a href="./arare_max42369/alpha_rarefaction_plots/rarefaction_plots.html" target="_blank">rarefaction_plots.html</a></td></tr>
    
    <tr colspan=2 align=center bgcolor=#e8e8e8><td colspan=2 align=center>Beta diversity results (even sampling: 42369)</td></tr>
    <tr><td>PCoA plot (weighted_unifrac)</td><td> <a href="./bdiv_even42369/weighted_unifrac_emperor_pcoa_plot/index.html" target="_blank">index.html</a></td></tr>
    <tr><td>Distance matrix (weighted_unifrac)</td><td> <a href="./bdiv_even42369/weighted_unifrac_dm.txt" target="_blank">weighted_unifrac_dm.txt</a></td></tr>
    <tr><td>Principal coordinate matrix (weighted_unifrac)</td><td> <a href="./bdiv_even42369/weighted_unifrac_pc.txt" target="_blank">weighted_unifrac_pc.txt</a></td></tr>
    <tr><td>PCoA plot (unweighted_unifrac)</td><td> <a href="./bdiv_even42369/unweighted_unifrac_emperor_pcoa_plot/index.html" target="_blank">index.html</a></td></tr>
    <tr><td>Distance matrix (unweighted_unifrac)</td><td> <a href="./bdiv_even42369/unweighted_unifrac_dm.txt" target="_blank">unweighted_unifrac_dm.txt</a></td></tr>
    <tr><td>Principal coordinate matrix (unweighted_unifrac)</td><td> <a href="./bdiv_even42369/unweighted_unifrac_pc.txt" target="_blank">unweighted_unifrac_pc.txt</a></td></tr>
    </table>
    
    <br>
    <table border=1 bgcolor="#C3FDB8">
    <tr colspan=2 align=center bgcolor=#e8e8e8><td colspan=2 align=center>Alpha diversity boxplots and statistics (by Group)</td></tr>
    <tr><td>Alpha diversity boxplots (PD_whole_tree)</td><td> <a href="./arare_max42369_Group/compare_PD_whole_tree/Group_boxplots.pdf" target="_blank">Group_boxplots.pdf</a></td></tr>
    <tr><td>Alpha diversity boxplots (chao1)</td><td> <a href="./arare_max42369_Group/compare_chao1_tt/Group_boxplots.pdf" target="_blank">Group_boxplots.pdf</a></td></tr>
    <tr><td>Alpha diversity boxplots (observed_otus)</td><td> <a href="./arare_max42369_Group/compare_observed_otus/Group_boxplots.pdf" target="_blank">Group_boxplots.pdf</a></td></tr>
    <tr><td>Alpha diversity boxplots (Shannon diversity index)</td><td> <a href="./arare_max42369_Group/compare_shannon_tt/Group_boxplots.pdf" target="_blank">Group_boxplots.pdf</a></td></tr>
    <tr><td>Alpha diversity statistics (PD_whole_tree)</td><td> <a href="./arare_max42369_Group/compare_PD_whole_tree/Group_stats.txt" target="_blank">Group_stats.txt</a></td></tr>
    <tr><td>Alpha diversity statistics (chao1)</td><td> <a href="./arare_max42369_Group/compare_chao1_tt/Group_stats.txt" target="_blank">Group_stats.txt</a></td></tr>
    <tr><td>Alpha diversity statistics (observed_otus)</td><td> <a href="./arare_max42369_Group/compare_observed_otus/Group_stats.txt" target="_blank">Group_stats.txt</a></td></tr>
    <tr><td>Alpha diversity statistics (Shannon diversity index)</td><td> <a href="./arare_max42369_Group/compare_shannon_tt/Group_stats.txt" target="_blank">Group_stats.txt</a></td></tr>
    
    <tr colspan=2 align=center bgcolor=#e8e8e8><td colspan=2 align=center>Beta diversity boxplots and statistics (by Group)</td></tr>
    <tr><td>Distance boxplots (weighted_unifrac)</td><td> <a href="./bdiv_even42369_Group/weighted_unifrac_boxplots/Group_Distances.pdf" target="_blank">Group_Distances.pdf</a></td></tr>
    <tr><td>Distance boxplots (unweighted_unifrac)</td><td> <a href="./bdiv_even42369_Group/unweighted_unifrac_boxplots/Group_Distances.pdf" target="_blank">Group_Distances.pdf</a></td></tr>
    <tr><td>Distance boxplots statistics (weighted_unifrac)</td><td> <a href="./bdiv_even42369_Group/weighted_unifrac_boxplots/Group_Stats.txt" target="_blank">Group_Stats.txt</a></td></tr>
    <tr><td>Distance boxplots statistics (unweighted_unifrac)</td><td> <a href="./bdiv_even42369_Group/unweighted_unifrac_boxplots/Group_Stats.txt" target="_blank">Group_Stats.txt</a></td></tr>
    
    <tr colspan=2 align=center bgcolor=#e8e8e8><td colspan=2 align=center>Beta diversity statistics (by Group)</td></tr>
    <tr><td>Beta diversity statistics (weighted)</td><td> <a href="./bdiv_even42369_Group/weighted_anosim_out/anosim_results.txt" target="_blank">anosim_results.txt</a></td></tr>
    <tr><td>Beta diversity statistics (unweighted)</td><td> <a href="./bdiv_even42369_Group/unweighted_anosim_out/anosim_results.txt" target="_blank">anosim_results.txt</a></td></tr>
    </table>
    
    </body></html>
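
Most rows of the page follow one of two patterns: a grey section header or a label with a link. When the sampling depth changes, every path has to change too, so it can be less error-prone to generate the rows. The following is a sketch with hypothetical helper names; it does no HTML escaping and assumes labels and paths contain no special characters.

```shell
#!/usr/bin/env bash
# Print a grey section-header row, in the same markup as the page above.
html_section() {
  printf '<tr colspan=2 align=center bgcolor=#e8e8e8><td colspan=2 align=center>%s</td></tr>\n' "$1"
}

# Print a "label | link" row; the link text is the file name of the target.
# $1: label  $2: relative path of the target file
html_row() {
  local label=$1 href=$2
  printf '<tr><td>%s</td><td> <a href="%s" target="_blank">%s</a></td></tr>\n' \
    "$label" "$href" "$(basename "$href")"
}
```

For example, with `depth=42369`, the call `html_row "Distance matrix (weighted_unifrac)" "./bdiv_even${depth}/weighted_unifrac_dm.txt"` prints the corresponding row of the table above.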
    
  13. Run Phyloseq.Rmd to get Phyloseq.html (under qiime1-env): Phyloseq is an R package for analyzing and visualizing microbiome data. Phyloseq.Rmd combines R code and markdown text; rendering it produces Phyloseq.html, a report with the analysis, figures, and narrative text.

    • gunzip table_even42369.biom.gz
    • Adapt Phyloseq.Rmd to the current data at the points marked FITTING1 to FITTING6 in the file:

      • FITTING1:

         tax_table(ps.ng.tax_most_)[1,"Domain"] <- str_split(tax_table(ps.ng.tax_most_)[1,"Domain"], "__")[[1]][2]
         ... ...
         tax_table(ps.ng.tax_most_)[167,"Species"] <- str_split(tax_table(ps.ng.tax_most_)[167,"Species"], "__")[[1]][2]
        
      • FITTING2: CONSOLE:

        alpha_diversity.py -i table_even42369.biom --metrics chao1,observed_otus,shannon,PD_whole_tree -o adiv_even.txt -t ../clustering/rep_set.tre
        
      • FITTING3:

        mkdir figures
        
      • FITTING4 (optional): if the error "Computation failed in stat_signif(): not enough 'y' observations" occurs, a group has too few observations for a p-value. In the original data, patient H47 had only one sample, so it was removed before the p-value calculation with div.df2 <- div.df2[-c(3), ]

      • FITTING5: correct the IDs of the group members (see FITTING6).

      • FITTING6: adjust the bar heights if the groups contain replicates.
    • R -e "rmarkdown::render('Phyloseq.Rmd',output_file='Phyloseq.html')": This command renders Phyloseq.Rmd with rmarkdown::render(), which executes the embedded R code and writes the report to Phyloseq.html. Phyloseq.Rmd is offered as a download in the original post.
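
The FITTING1 lines keep the second piece of each taxonomy label after splitting on "__". To illustrate what that does, here is the same operation as a shell sketch. It assumes labels of the form `D_0__Bacteria`, as used in the SILVA 132 QIIME release taxonomy; `strip_rank_prefix` is a hypothetical helper. Unlike the R code, it returns the label unchanged when no "__" is present.

```shell
#!/usr/bin/env bash
# Drop a rank prefix such as "D_0__" from a single taxonomy label.
strip_rank_prefix() {
  # split on the first "__" and keep the part after it, if there is one
  printf '%s\n' "$1" | awk -F'__' '{ print (NF >= 2 ? $2 : $1) }'
}
```

For instance, `strip_rank_prefix "D_0__Bacteria"` prints `Bacteria`.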

© 2023 XGenes.com