Contributors | Affiliation | Role |
---|---|---|
Vega Thurber, Rebecca | Oregon State University (OSU) | Principal Investigator |
Muller, Erinn M. | Mote Marine Laboratory (Mote) | Co-Principal Investigator |
Klinges, Grace J. | Mote Marine Laboratory (Mote) | Scientist |
Merchant, Lynne M. | Woods Hole Oceanographic Institution (WHOI BCO-DMO) | BCO-DMO Data Manager |
Samples of coral tissue, skeleton, and mucus were taken from two genotypes of Acropora cervicornis prior to nutrient enrichment (n = 20 per genotype), prior to disease exposure (n = 18 per genotype), and at various stages during disease development. All surviving ramets at one week after disease exposure were sampled. To sample each coral, 6-8 polyps were excised using a flame-sterilized blade and placed in a 1.5 mL microcentrifuge tube containing 1 mL of DNA/RNA Shield (Zymo Research, R1100-250, Irvine, CA, USA). Samples were transferred to a -80℃ freezer for long-term storage. In preparation for RNA extractions, the samples were removed from the -80℃ freezer and thawed on ice. With flame-sterilized tweezers, half of the biomass was transferred to a Disruptor Tube (Omega Bio-Tek, Norcross, GA, USA); the other half was kept as a bioarchive and returned to -80℃. RNA from each sample was isolated using the E.Z.N.A. DNA/RNA Isolation Kit (Omega Bio-Tek, Norcross, GA, USA) with slight modifications to the manufacturer's protocol to increase yield. RNA isolates were stored at -80℃. DNA quantity and quality were assessed using a NanoDrop spectrophotometer (Thermo Fisher Scientific™, Waltham, MA, USA). Samples were shipped on dry ice to the Oklahoma Medical Research Foundation NGS Core, where RNA cleanup, precipitation, and polyA selection were performed. Libraries were prepared using the IDT xGen RNA Library kit. Final QC was performed using KAPA qPCR and Agilent TapeStation to confirm rRNA content, and libraries were sequenced on a NovaSeq 6000 using S4 chemistry.
A total of 42 samples were successfully sequenced on an Illumina NovaSeq 6000 with S4 chemistry and PE 150 bp reads. Prior to quality filtering, sequencing produced an average single-end read depth of 11,084,445.3 +/- 970,330.08. After filtration, an average of 10,954,237.2 +/- 982,482.11 reads remained per read direction. From quality-filtered sequences, 72.54% of single-end reads mapped to the A. cervicornis transcriptome using STAR. Quantification using Salmon resulted in 24,875 genes having at least one count across all samples, with subsequent filtering (less than 1 count in more than 10 samples) reducing this to 12,913 genes for downstream analysis. Of reads not aligning to the A. cervicornis transcriptome, an average of 22.54% aligned to the Symbiodinium (Clade A) reference transcriptome using STAR. Quantification using Salmon yielded counts for 73,112 transcripts, with 26,225 of these retained for downstream analysis after filtering (less than 1 count in more than 10 samples). Analysis of differentially expressed transcripts is ongoing.
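The low-count filter described above could be sketched roughly as follows. This is an illustration only, not the investigators' actual code: it assumes the rule means "drop a gene or transcript if it has less than 1 count in more than 10 samples", and the function name, parameter names, and toy counts are all hypothetical.

```python
def filter_low_count_features(counts, min_count=1.0, max_low_samples=10):
    """Keep a feature unless it is below `min_count` in more than
    `max_low_samples` samples (one assumed reading of the filter rule)."""
    kept = {}
    for feature, values in counts.items():
        # Number of samples in which this feature has fewer than min_count counts.
        n_low = sum(1 for v in values if v < min_count)
        if n_low <= max_low_samples:
            kept[feature] = values
    return kept


# Toy data for 42 samples; these values are invented for illustration.
toy_counts = {
    "gene_a": [5.0] * 42,               # never low: kept
    "gene_b": [0.0] * 11 + [3.0] * 31,  # low in 11 samples: dropped
    "gene_c": [0.5] * 10 + [2.0] * 32,  # low in exactly 10 samples: kept
}

filtered = filter_low_count_features(toy_counts)
print(sorted(filtered))
```

If the intended rule instead treats "10 or more" low samples as the cutoff, only the comparison against `max_low_samples` would need to change.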
Four files were joined together to create the final dataset. The steps below describe how the final dataset was produced.
1) The most natural columns for joining the file metadata_rnaseq.xlsx with the file Accession_numbers_nut_dis_rnaseq.tsv are "ID" in metadata_rnaseq.xlsx (example value of 36-C1-DT1) and "sample_title" in Accession_numbers_nut_dis_rnaseq.tsv (example value of 36-C1-DT1), since both columns hold unique values that appear to share the same content and format.
However, there is a problem joining on row 6. The row 6 value of "ID" is 36-C15 in metadata_rnaseq.xlsx, and the row 6 value of "sample_title" is 36-C15-DT1 in Accession_numbers_nut_dis_rnaseq.tsv. The row 6 value of "ID" in metadata_rnaseq.xlsx should probably be 36-C15-DT1 rather than 36-C15, because the "full_name" value in that row is 36.C15.DT1.7.25. This is assumed from the pattern of all the other row entries. And in row 6 of Accession_numbers_nut_dis_rnaseq.tsv, the value of "full_name" is 36.C15.DT1.7.25.
Also, the column "ID" appears to be the value in the column "full_name" with the date portion removed. This implies that the row 6 value of "rnaseq_name" should be 36-C15-DT1_7/25/22 if the row 6 value of "ID" is changed to 36-C15-DT1.
Rows 6 & 7 are listed below to show the pattern of each column.
Rows 6 & 7 of metadata_rnaseq.xlsx
full_name | rnaseq_name | replicate | ID |
---|---|---|---|
36.C15.DT1.7.25 | 36-C15_7/25/22 | AC36-C15 | 36-C15 |
36.C16.DT1.7.24 | 36-C16-DT1_7/24/22 | AC36-C16 | 36-C16-DT1 |
Rows 6 & 7 of Accession_numbers_nut_dis_rnaseq.tsv
accession | message | sample_name | sample_title |
---|---|---|---|
SAMN38581337 | Successfully loaded | 36-C15-DT1_7/25/22 | 36-C15-DT1 |
SAMN38581338 | Successfully loaded | 36-C16-DT1_7/24/22 | 36-C16-DT1 |
2) The next best columns for joining the file metadata_rnaseq.xlsx with the file Accession_numbers_nut_dis_rnaseq.tsv are "full_name" in metadata_rnaseq.xlsx and "sample_name" in Accession_numbers_nut_dis_rnaseq.tsv. A sample value of "full_name" is 36.C15.DT1.7.25 in metadata_rnaseq.xlsx and a sample value of "sample_name" is 36-C15-DT1_7/25/22 in Accession_numbers_nut_dis_rnaseq.tsv.
To perform that join, the column "sample_name" was duplicated in Accession_numbers_nut_dis_rnaseq.tsv and the copy was reformatted by replacing each dash, underscore, and forward slash with a period and removing the year. As an example, 36-C15-DT1_7/25/22 was converted to 36.C15.DT1.7.25. A join can then be performed because the format of the two columns is identical.
3) While joining the file metadata_rnaseq.xlsx with the file Accession_numbers_nut_dis_rnaseq.tsv, a difference in values was noticed in row 34. In the file metadata_rnaseq.xlsx, the row 34 value of "full_name" is 46.N13.DT1A.7.23, and in the file Accession_numbers_nut_dis_rnaseq.tsv, the row 34 value of "sample_name" is 46-N13-DT1_7/23/22. Aside from the different format, there is an A in the "full_name" value and no A in the "sample_name" value.
In the file metadata_rnaseq.xlsx, the other columns in row 34, such as "rnaseq_name" and "ID", also lack the A.
In the file metadata_rnaseq.xlsx, this is part of row 34
full_name | rnaseq_name | replicate | ID |
---|---|---|---|
46.N13.DT1A.7.23 | 46-N13-DT1_7/23/22 | AC46-N13 | 46-N13-DT1 |
In the file Accession_numbers_nut_dis_rnaseq.tsv, this is part of row 34
accession | message | sample_name | sample_title |
---|---|---|---|
SAMN38581365 | Successfully loaded | 46-N13-DT1_7/23/22 | 46-N13-DT1 |
4) For now, the "full_name" value of 46.N13.DT1A.7.23 was changed to 46.N13.DT1.7.23 by removing the 'A'.
The joined table was named join_1.
5) In the table join_1, the columns "bioproject_accession" and "host" were removed because they only contain empty values. Once the bioproject_accession value is known, this column will be added back in.
6) Any column whose values duplicated those of another column was removed.
7) The column "message", which contained the value "Successfully loaded", was removed because it relates to the NCBI data submission and contains no information relevant to the data themselves.
8) Then the table join_1 was joined with the file Nut_Dis_RNAseq_Invertebrate.1.0.xlsx on the column "sample_title" which is common to both files. The resulting joined table was named join_2.
9) In table join_2, the following columns were removed because their values duplicated those of other columns: sample_name, sample_title, isolate, organism, isolation_source, geo_loc_name, collection_date, Date, tissue, Genotype, Treatment, Exposure, Disease_exposure, Health_status, and group.
10) The column "breed" in table join_2 was removed because it holds the value "not applicable", which is specific to filling out the NCBI data submission and contains no information relevant to the data themselves.
11) The columns in table join_2 named "bioproject_accession" and "host" were removed because all the values are blank. Once the bioproject_accession value is known, this column will be added back in.
12) There is one remaining submitted file to join with, called SRA_metadata_NutDis_rnaseq.xlsx.
13) Table join_2 appears to be capable of joining with the file SRA_metadata_NutDis_rnaseq.xlsx using the unique values of "full_name" in join_2 and the unique values of "library_ID" in SRA_metadata_NutDis_rnaseq.xlsx.
14) A problem with joining on "full_name" and "library_ID" is row 34, which has a value of 46.N13.DT1A.7.23 in the column "library_ID" and a value of 46.N13.DT1.7.23 in the column "full_name" of join_2. There is an 'A' in the "library_ID" value that does not appear in the "full_name" value.
15) For now, the row 34 value of "library_ID" in the file SRA_metadata_NutDis_rnaseq.xlsx was changed from 46.N13.DT1A.7.23 to 46.N13.DT1.7.23 by removing the 'A'. This is consistent with the earlier removal of the 'A' from the row 34 value of "full_name" (46.N13.DT1A.7.23) in the file metadata_rnaseq.xlsx.
In the file SRA_metadata_NutDis_rnaseq.xlsx, this is part of row 34
sample_name | library_ID |
---|---|
46-N13-DT1_7/23/22 | 46.N13.DT1A.7.23 |
16) After the removal of 'A' from the value in row 34 of the column "library_ID", table join_2 is joined with the file SRA_metadata_NutDis_rnaseq.xlsx. The unique column "full_name" was used from join_2 and the unique column "library_ID" was used from the file SRA_metadata_NutDis_rnaseq.xlsx to create the join. The resulting table was named join_3.
17) Empty columns in table join_3 were removed.
18) Columns containing file information, such as file name, for the files submitted to NCBI were removed because they are relevant only to the NCBI submission and not to the dataset itself. If the file information is required, it can be found by following the NCBI accession values of the dataset.
19) The column "breed", which contained the value "not applicable", was removed because it appears to hold information specific to the NCBI submission and is not relevant to the dataset itself.
20) Columns containing instrument values were removed because the instrument used will be noted on the dataset page in the Instruments section.
21) The column that duplicated the column "full_name" was removed.
22) For now, both the columns "sample_name" and "rnaseq_name" are retained because these otherwise duplicate columns differ in row 6. The row 6 value of "sample_name" is 36-C15-DT1_7/25/22 and the row 6 value of "rnaseq_name" is 36-C15_7/25/22, which is missing the "DT1" portion.
23) For now, both the columns "sample_title" and "ID" are retained because these otherwise duplicate columns differ in row 6. The row 6 value of "sample_title" is 36-C15-DT1 and the row 6 value of "ID" is 36-C15, which is missing the "DT1" portion.
In the final table join_3, this is part of row 6
accession | sample_name | rnaseq_name | library_ID | sample_title | ID | replicate |
---|---|---|---|---|---|---|
SAMN38581337 | 36-C15-DT1_7/25/22 | 36-C15_7/25/22 | 36.C15.DT1.7.25 | 36-C15-DT1 | 36-C15 | AC36-C15 |
24) In summary of steps 22 and 23, for row 6 of the final table join_3, the columns "rnaseq_name" and "ID" are missing the "DT1" portion that the columns "sample_name" and "sample_title" contain.
25) Any column name with an asterisk had the asterisk removed to follow the BCO-DMO parameter naming conventions.
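The key reformatting in step 2 and the mismatch found in step 3 can be illustrated with a minimal Python sketch. This is not the tool used to build the dataset; the function names are hypothetical, and the only values used are the row 6, 7, and 34 entries quoted in the steps above.

```python
import re


def sample_name_to_full_name(sample_name):
    """Convert e.g. '36-C15-DT1_7/25/22' to '36.C15.DT1.7.25'."""
    # Remove the trailing two-digit year, then replace each dash,
    # underscore, and forward slash with a period.
    without_year = re.sub(r"/\d{2}$", "", sample_name)
    return re.sub(r"[-_/]", ".", without_year)


def unmatched_keys(left_keys, right_keys):
    """Return the keys found only on the left and only on the right."""
    left, right = set(left_keys), set(right_keys)
    return sorted(left - right), sorted(right - left)


# "full_name" values from metadata_rnaseq.xlsx (rows 6, 7, and 34).
metadata_full_names = ["36.C15.DT1.7.25", "36.C16.DT1.7.24", "46.N13.DT1A.7.23"]
# "sample_name" values from Accession_numbers_nut_dis_rnaseq.tsv (same rows).
accession_sample_names = ["36-C15-DT1_7/25/22", "36-C16-DT1_7/24/22", "46-N13-DT1_7/23/22"]

converted = [sample_name_to_full_name(name) for name in accession_sample_names]
only_in_metadata, only_in_accessions = unmatched_keys(metadata_full_names, converted)

print(converted)
print(only_in_metadata, only_in_accessions)
```

Checking for unmatched keys before joining surfaces discrepancies such as the extra 'A' in row 34, so they can be resolved deliberately rather than silently dropping a row from the join.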
Dataset-specific Instrument Name | Illumina NovaSeq 6000 |
Generic Instrument Name | Automated DNA Sequencer |
Generic Instrument Description | General term for a laboratory instrument used for deciphering the order of bases in a strand of DNA. Sanger sequencers detect fluorescence from different dyes that are used to identify the A, C, G, and T extension reactions. Contemporary or Pyrosequencer methods are based on detecting the activity of DNA polymerase (a DNA synthesizing enzyme) with another chemoluminescent enzyme. Essentially, the method allows sequencing of a single strand of DNA by synthesizing the complementary strand along it, one base pair at a time, and detecting which base was actually added at each step. |
Dataset-specific Instrument Name | |
Generic Instrument Name | Thermo Scientific NanoDrop spectrophotometer |
Generic Instrument Description | Thermo Scientific NanoDrop spectrophotometers provide microvolume quantification and purity assessments of DNA, RNA, and protein samples. NanoDrop spectrophotometers work on the principle of ultraviolet-visible spectrum (UV-Vis) absorbance. The range consists of the NanoDrop One/OneC UV-Vis Spectrophotometers, NanoDrop Eight UV-Vis Spectrophotometer and NanoDrop Lite Plus UV Spectrophotometer. |
NSF Award Abstract:
Historically one of the most abundant reef-building corals in Florida and the wider Caribbean, the staghorn coral, Acropora cervicornis, is now listed as critically endangered primarily because of previous and reoccurring disease events. Understanding the holistic mechanisms of disease susceptibility in this coral is a top concern of practitioners engaged in conservation and restoration. The investigators recently discovered a group of parasitic bacteria common within the microbial community of A. cervicornis that can reduce the growth and health of corals when reefs are exposed to nutrient polluted waters. Determining how interactions among the coral host, this parasitic microbe, and the environment are linked to disease susceptibility provides critical insight and greater success of future restoration efforts. Yet the complexity of animal microbiomes and the contextual nature of disease make it difficult to identify the specific cause of many disease outbreaks. In this project, the investigators conduct experiments to explore the interactions among different genetic strains of coral and these bacteria in various nutrient scenarios to better understand how this bacterium affects the susceptibility of staghorn coral to diseases. This project also characterizes the genomics, host range, and local and global distribution of this bacterial coral parasite to determine how its evolutionary history and physiology drive disease susceptibility in this important coral species. The project trains two postdocs, one technician, and seven students (one graduate, six undergraduates) in integrative sciences that span marine science, physiology, genetics, microbiology, omics, and statistical modeling. A research-based after school program in Florida is expanded to include microbiology and create a new program module called Microbial warriors, with a focus on women in science. 
The investigators produce documentary style films and outreach materials to broadly communicate the project science and conservation efforts to local and national communities via presentations at Mote Marine Lab and the Oregon Museum of Science and Industry. This project is co-funded by the Biological Oceanography Program in the Division of Ocean Sciences and the Symbiosis, Defense, and Self-recognition Program in the Division of Integrative Organismal Systems.
The investigators recently identified a marine Rickettsiales bacterium that, in corals, can be stimulated to grow in the presence of elevated nitrogen and phosphorous species. Based on genomic reconstruction and phylogeography, this bacteria is classified as a novel bacterial genus, Candidatus Aquarickettsia, and showed that it is broadly associated with scleractinian corals worldwide. Importantly, using a model system, the endangered Acropora cervicornis coral, the team has also shown that the growth of this bacterium in vivo is associated with reduced host growth and increased disease susceptibility. This project aims to more completely evaluate the mechanisms behind and impacts of these inducible infections on coral physiology and host-bacterial symbiosis. The investigators conduct nutrient dosing experiments on different coral genotypes with various Rickettsiales abundances. Using a range of omics and microscopy techniques, the team quantifies the resulting effects on holobiont phenotypes. The investigators are also comparing the genomes of these bacteria in the different Acroporid hosts and other coral genera to evaluate facets of the bacterium's evolutionary history, as well as to identify possible mechanisms of its proliferation, virulence, and host specificity. This interdisciplinary project mechanistically links nutrients to temporal changes in host, algal symbiont, and bacterial parasite physiology and also explain why there is natural variation in these responses by exploring how host and parasite genotypes and growth dynamics combined with environmental contextuality alter holobiont phenotypes.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Funding Source | Award |
---|---|
NSF Division of Ocean Sciences (NSF OCE) | |
NSF Division of Ocean Sciences (NSF OCE) |