Bubble plot for 1457∆atlE vs 1457-M10 vs 1457 vs mock

Tags: plot, R, RNA-seq

  1. R code for bubbleplot

    library(ggplot2)
    library(readxl)
    
    # Read the top-10 up- and down-regulated KEGG pathways (1457 vs mock)
    mydat <- read_excel("Pathway_KEGG_1457_vs_mock_Top10.xlsx")
    
    # Convert a fraction string such as "42/839" to a decimal
    convert_fraction_to_decimal <- function(fraction) {
        parts <- strsplit(as.character(fraction), "/")[[1]]
        as.numeric(parts[1]) / as.numeric(parts[2])
    }
    
    mydat$GeneRatio <- sapply(mydat$Ratio, convert_fraction_to_decimal)
    mydat$Category <- factor(mydat$Category, levels = c("Up-regulated", "Down-regulated"))
    
    # Fix the pathway order on the y axis; rev() puts the first entry at the top
    description_order <- rev(c(
        "TNF signaling pathway", "Legionellosis",
        "Cytokine-cytokine receptor interaction",
        "Protein processing in endoplasmic reticulum",
        "Toxoplasmosis", "Fluid shear stress and atherosclerosis",
        "Pathways in cancer", "JAK-STAT signaling pathway",
        "IL-17 signaling pathway", "Influenza A",
        "Transcriptional misregulation in cancer",
        "Glycine serine and threonine metabolism",
        "Antifolate resistance", "Base excision repair",
        "Metabolic pathways", "Acute myeloid leukemia",
        "Homologous recombination", "Fanconi anemia pathway",
        "Primary immunodeficiency", "MAPK signaling pathway"
    ))
    mydat$Description <- factor(mydat$Description, levels = description_order)
    
    # Axis titles are drawn larger than the axis tick labels
    axis_label_size <- 24
    
    # Build the plot object first, then print it into the PNG device.
    # The explicit print() is needed when the script is run via source().
    p <- ggplot(mydat, aes(x = GeneRatio, y = Description)) +
      geom_point(aes(color = Category, size = Count, alpha = -log10(FDR))) +
      scale_color_manual(values = c("Up-regulated" = "red", "Down-regulated" = "blue")) +
      scale_size_continuous(range = c(4, 10)) +
      labs(x = "GeneRatio", y = "Pathway name", color = "Category", size = "Count", alpha = "-log10(FDR)") +
      theme(
        axis.text.x = element_text(angle = 20, vjust = 0.5, size = 20),
        axis.text.y = element_text(size = 20),
        axis.title.x = element_text(size = axis_label_size),
        axis.title.y = element_text(size = axis_label_size),
        legend.text = element_text(size = 20),
        legend.title = element_text(size = 20)
      ) +
      guides(color = guide_legend(override.aes = list(size = 10)), alpha = guide_legend(override.aes = list(size = 10)))
    
    png("bubble_plot.png", width = 1000, height = 800)
    print(p)
    dev.off()
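
Because `factor()` silently turns any value that is not listed in `levels` into `NA`, a misspelled category or pathway name in the Excel sheet simply disappears from the plot or loses its color. The following is a minimal sketch of a check that could be run before the factor conversion. The helper name `unmatched_levels` and the example vector are hypothetical and only serve as illustration.

```r
# Hypothetical helper: list values that are not covered by the intended levels
unmatched_levels <- function(values, levels) {
  setdiff(unique(as.character(values)), levels)
}

category_levels <- c("Up-regulated", "Down-regulated")

# Made-up example vector containing one misspelled label
example_categories <- c("Up-regulated", "Up-regualted", "Down-regulated")

# Reports the label that would become NA
print(unmatched_levels(example_categories, category_levels))

# factor() itself gives no warning for the unmatched label
print(factor(example_categories, levels = category_levels))
```

The same check can be applied to `mydat$Description` against `description_order` before the plot is built.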
    
  2. R code for bubbleplot2

    library(readxl)
    library(ggplot2)
    
    # Read data from an Excel file
    mydat <- read_excel("1457_M10_atlE_DEGs_all_pathway-2.xlsx")
    
    mydat$Comparison <- factor(mydat$Comparison, levels = c("1457", "1457-M10", "1457∆atlE"))
    
    # Fix the pathway order on the y axis; rev() puts the first entry at the top
    description_order <- rev(c(
        "Protein processing in endoplasmic reticulum",
        "TNF signaling pathway", "Legionellosis",
        "Epstein-Barr virus infection", "Toxoplasmosis",
        "Osteoclast differentiation", "Proteasome", "Influenza A",
        "Herpes simplex infection", "HIF-1 signaling pathway",
        "NOD-like receptor signaling pathway", "Apoptosis",
        "C-type lectin receptor signaling pathway",
        "MAPK signaling pathway", "Endocytosis",
        "Neurotrophin signaling pathway",
        "Ubiquitin mediated proteolysis", "Pancreatic cancer"
    ))
    mydat$Description <- factor(mydat$Description, levels = description_order)
    
    # All points have the same size; color encodes the adjusted p-value
    p <- ggplot(mydat, aes(x = Comparison, y = Description)) +
      geom_point(aes(color = p.adjust), size = 10) +
      labs(x = "", y = "", color = "p.adjust") +
      theme(
        axis.text = element_text(size = 20),
        axis.text.x = element_text(angle = 20, vjust = 0.5),
        legend.text = element_text(size = 20),
        legend.title = element_text(size = 20)
      )
    
    # The explicit print() is needed when the script is run via source()
    png("bubble_plot2.png", width = 1000, height = 800)
    print(p)
    dev.off()
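
The adjusted p-values in this data set span several orders of magnitude, so on a linear color scale most points end up at one end of the gradient. If a -log10 scale is preferred, the transformed column can be computed before plotting. This is a minimal sketch using three rows from the bubbleplot2 input; the column name `neg_log10_padj` is a hypothetical choice.

```r
# Three rows taken from the bubbleplot2 input table
example <- data.frame(
  Comparison  = c("1457", "1457-M10", "1457"),
  Description = c("TNF signaling pathway", "TNF signaling pathway", "Legionellosis"),
  p.adjust    = c(2.6941E-05, 4.5734E-06, 0.0062434)
)

# Hypothetical column name; larger values correspond to smaller adjusted p-values
example$neg_log10_padj <- -log10(example$p.adjust)
print(example)
```

In the plot, this column could then be mapped with `aes(color = neg_log10_padj)` and labelled with `labs(color = "-log10(p.adjust)")`.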
    
  3. Input Excel for bubbleplot

    Description                                  Size  Expect  Ratio   P Value     FDR         Category        Count
    TNF signaling pathway                        110   93/373  42/839  2.22E-12    7.24E-10    Up-regulated    42
    Legionellosis                                55    46/686  55/691  2.9E-10     4.72E-08    Up-regulated    55
    Cytokine-cytokine receptor interaction       294   24/956  26/046  1.95E-09    2.12E-07    Up-regulated    26
    Protein processing in endoplasmic reticulum  165   14/006  30/701  1.01E-07    8.27E-06    Up-regulated    30
    Toxoplasmosis                                113   95/919  35/447  2.2E-07     1.43E-05    Up-regulated    35
    Fluid shear stress and atherosclerosis       139   11/799  31/359  1.52E-06    8.26E-05    Up-regulated    31
    Pathways in cancer                           526   44/649  19/037  1.92E-05    0.000896    Up-regulated    19
    JAK-STAT signaling pathway                   162   13/751  27/634  4.35E-05    0.00177     Up-regulated    27
    IL-17 signaling pathway                      93    78/942  32/935  0.000285    1.0327E-06  Up-regulated    32
    Influenza A                                  171   14/515  25/491  0.000687    2.241E-06   Up-regulated    25
    Transcriptional misregulation in cancer      186   83/425  23/974  0.0002368   0.038559    Down-regulated  23
    Glycine serine and threonine metabolism      40    17/941  44/591  0.00032864  0.038559    Down-regulated  44
    Antifolate resistance                        31    13/904  50/345  0.00035484  0.038559    Down-regulated  50
    Base excision repair                         33    14/801  47/294  0.00053358  0.043487    Down-regulated  47
    Metabolic pathways                           1305  58/532  12/984  0.0075183   0.40292     Down-regulated  12
    Acute myeloid leukemia                       66    29/602  27/025  0.008932    0.40292     Down-regulated  27
    Homologous recombination                     41    18/389  32/628  0.0092737   0.40292     Down-regulated  32
    Fanconi anemia pathway                       54    24/220  28/902  0.0098875   0.40292     Down-regulated  28
    Primary immunodeficiency                     37    16/595  30/129  0.023604    0.77755     Down-regulated  30
    MAPK signaling pathway                       295   13/231  15/871  0.023851    0.77755     Down-regulated  15
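
The bubbleplot code converts the `Ratio` column one value at a time with `sapply()`. A vectorized variant that also returns `NA` for malformed entries might look like the sketch below. It assumes, as the plotting code does, that each `Ratio` value is a "numerator/denominator" string; the function name `ratio_to_decimal` is hypothetical.

```r
# Hypothetical helper: convert "numerator/denominator" strings to decimals.
# Entries that do not split into exactly two parts yield NA.
ratio_to_decimal <- function(x) {
  parts <- strsplit(as.character(x), "/", fixed = TRUE)
  vapply(parts, function(p) {
    if (length(p) != 2) return(NA_real_)
    as.numeric(p[1]) / as.numeric(p[2])
  }, numeric(1))
}

# Two values from the table above, plus one malformed entry
print(ratio_to_decimal(c("42/839", "55/691", "n.a.")))
```

With this helper, the conversion step becomes `mydat$GeneRatio <- ratio_to_decimal(mydat$Ratio)`, and any `NA` in the result points to a row that needs checking in the Excel sheet.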
    
  4. Input Excel for bubbleplot2

    Comparison  Description                                  p.adjust
    1457        Protein processing in endoplasmic reticulum  6.7681E-08
    1457-M10    Protein processing in endoplasmic reticulum  4.6253E-06
    1457∆atlE   Protein processing in endoplasmic reticulum  2.0787E-05
    1457        TNF signaling pathway                        2.6941E-05
    1457-M10    TNF signaling pathway                        4.5734E-06
    1457∆atlE   TNF signaling pathway                        3.3099E-06
    1457        Legionellosis                                0.0062434
    1457-M10    Legionellosis                                4.6253E-06
    1457∆atlE   Legionellosis                                0.0073192
    1457        Epstein-Barr virus infection                 0.0062434
    1457-M10    Epstein-Barr virus infection                 1.6635E-06
    1457∆atlE   Epstein-Barr virus infection                 0.00049454
    1457        Toxoplasmosis                                0.0064469
    1457        Osteoclast differentiation                   4.6509E-06
    1457-M10    Osteoclast differentiation                   2.6616E-05
    1457        Proteasome                                   1.0391E-05
    1457        Influenza A                                  1.4677E-05
    1457        Herpes simplex infection                     1.5915E-05
    1457∆atlE   Herpes simplex infection                     1.857E-06
    1457        HIF-1 signaling pathway                      1.6873E-05
    1457-M10    NOD-like receptor signaling pathway          2.22E-06
    1457∆atlE   NOD-like receptor signaling pathway          0.0096378
    1457-M10    Apoptosis                                    9.54E-06
    1457-M10    C-type lectin receptor signaling pathway     1.37E-05
    1457-M10    MAPK signaling pathway                       5.3439E-05
    1457-M10    Endocytosis                                  5.49E-05
    1457∆atlE   Endocytosis                                  1.857E-06
    1457∆atlE   Neurotrophin signaling pathway               0.00049454
    1457∆atlE   Ubiquitin mediated proteolysis               0.0088734
    1457∆atlE   Pancreatic cancer                            1.857E-06
    

© 2023 XGenes.com