Total spectral counts of peptides from the R/V Falkor cruise 160115 in the Central Pacific for the ProteOMZ expedition in 2016

Website: https://www.bco-dmo.org/dataset/737596
Data Type: Cruise Results
Version: 1
Version Date: 2022-06-03

Project
» The ProteOMZ Expedition: Investigating Life Without Oxygen in the Pacific Ocean (ProteOMZ (Proteomics in an Oxygen Minimum Zone))
Contributors | Affiliation | Role
Saito, Mak A. | Woods Hole Oceanographic Institution (WHOI) | Principal Investigator
Ake, Hannah | Woods Hole Oceanographic Institution (WHOI BCO-DMO) | BCO-DMO Data Manager
York, Amber D. | Woods Hole Oceanographic Institution (WHOI BCO-DMO) | BCO-DMO Data Manager

Abstract
Relative protein abundance data for the upper 1000 m of the water column from the ProteOMZ R/V Falkor expedition. There are 109952 unique peptides, each with a spectral count for each of the 102 samples, for roughly 11.2 million data points (109952 × 102 = 11215104).


Coverage

Location: Central Pacific
Spatial Extent: N:17.4465 E:-139.1089 S:-0.4708 W:-157.3022
Temporal Extent: 2016-01-19 - 2016-01-28

Dataset Description

These data are part of the Ocean Protein Portal "ProteOMZ" dataset v3 (https://proteinportal.whoi.edu/; Saito et al., 2019).


Data Processing Description

The raw mass spectra files were searched using SEQUEST within Proteome Discoverer v2.2 software. Processed files were then loaded into Proteome Software, and protein and peptide reports as well as FASTA files were exported. The files were modified slightly to map to the Protein Portal data model for submission to BCO-DMO. The peptide report was too large to work with in Excel, so it was modified in Pandas/Python to produce a CSV file.
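A table too large for a spreadsheet can still be processed row by row. The sketch below is not the authors' script: it uses only the Python standard library, and it assumes the exported CSV has one row per peptide and sample with the `sample_id` and `spectral_count_sum` columns listed under Parameters. The function name and the demo rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def total_counts_per_sample(lines):
    """Stream CSV rows and sum spectral_count_sum for each sample_id.

    `lines` is any iterable of text lines (an open file works), so the
    whole table never has to be held in memory at once.
    """
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        value = row["spectral_count_sum"]
        if value == "":  # blank cells are missing values in BCO-DMO CSV files
            continue
        totals[row["sample_id"]] += float(value)
    return dict(totals)


# Tiny made-up table standing in for the real 775.75 MB file.
demo = io.StringIO(
    "sample_id,peptide_sequence,spectral_count_sum\n"
    "demo_A,PEPTIDEK,3\n"
    "demo_A,SAMPLER,\n"
    "demo_B,PEPTIDEK,5\n"
    "demo_A,ANOTHERK,2\n"
)
demo_totals = total_counts_per_sample(demo)
print(demo_totals)
```

For the real file, pass an open handle instead of the demo buffer, for example `open("737596_v1_proteomz-peptides.csv", newline="")`. In pandas, the same streaming idea is available through the `chunksize` argument of `read_csv`.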
 


BCO-DMO Processing Description

Preprocessing:
-Date, time, filter min, filter max, lat, lon, and cruise columns added based on information from the Falkor 160115 Event log and CTD log.
-Column names reformatted to comply with BCO-DMO standards.

Dataset Version 1: This file version (2022-06-03) replaces a previous revision of dataset version 1 from 2019-02-24.
* Data from source file "ProteOMZ_peptides_for_OPP.csv" was imported into the BCO-DMO data system for this dataset. This file is from Ocean Protein Portal "ProteOMZ" dataset v3 (file version 2022-06-03).
** In the BCO-DMO data system missing data identifiers are displayed according to the format of data you access. For example, in csv files it will be blank (null) values. In Matlab .mat files it will be NaN values. When viewing data online at BCO-DMO, the missing value will be shown as blank (null) values.

* Column names adjusted to conform to BCO-DMO naming conventions designed to support broad re-use by a variety of research tools and scripting languages (only numbers, letters, and underscores; cannot start with a number), e.g. date_y-m-d changed to date_ymd.

* ISO_DateTime_UTC column (date and time with UTC timezone, ISO 8601 format) added, converted from the local dates and times in HST.

* Data table attached to dataset as Data File:"737596_v1_proteomz-peptides.csv"
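
Two of the steps above, the column renaming and the HST to UTC conversion, can be sketched as follows. This is an illustration under stated assumptions, not BCO-DMO's actual code: the function names are hypothetical, disallowed characters are simply dropped (which reproduces the date_y-m-d to date_ymd example), the underscore prefix for names starting with a digit is a guess, and the input date and time are assumed to look like 2016-01-19 and 14:30:00. The exact ISO 8601 variant used in the data file may differ from the one emitted here.

```python
import re
from datetime import datetime, timedelta, timezone

# Hawaii Standard Time is UTC-10 all year (no daylight saving time).
HST = timezone(timedelta(hours=-10), "HST")


def to_safe_column_name(name):
    """Keep only letters, digits, and underscores; never start with a digit."""
    cleaned = re.sub(r"[^0-9A-Za-z_]", "", name)
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned  # assumed fix-up, not a documented BCO-DMO rule
    return cleaned


def hst_to_iso_utc(date_ymd, time_hms):
    """Combine local HST date and time strings into an ISO 8601 UTC string."""
    local = datetime.strptime(f"{date_ymd} {time_hms}", "%Y-%m-%d %H:%M:%S")
    local = local.replace(tzinfo=HST)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_safe_column_name("date_y-m-d"))
print(hst_to_iso_utc("2016-01-19", "14:30:00"))
```

Because HST is ten hours behind UTC, afternoon casts in local time fall on the next calendar day in UTC, which matters when matching samples against the Temporal Extent above.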



Data Files

File
737596_v1_proteomz-peptides.csv
(Comma Separated Values (.csv), 775.75 MB)
MD5:2a69f9ed3937e0384f9d9b453108dd63
Primary data file for dataset ID 737596, version 1
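
Given the file size, it may be worth checking a download against the MD5 value listed above before analysis. The following is a minimal sketch using Python's standard `hashlib`; the function name is hypothetical, and the in-memory buffer only stands in for the real file.

```python
import hashlib
import io

EXPECTED_MD5 = "2a69f9ed3937e0384f9d9b453108dd63"  # value from the Data Files entry


def md5_of_stream(handle, block_size=1 << 20):
    """Hash a binary stream in 1 MiB blocks so large files need little memory."""
    digest = hashlib.md5()
    for block in iter(lambda: handle.read(block_size), b""):
        digest.update(block)
    return digest.hexdigest()


# Demo on a small in-memory buffer instead of the downloaded file.
demo_digest = md5_of_stream(io.BytesIO(b"hello"))
print(demo_digest)
```

For the downloaded file, open it in binary mode, e.g. `open("737596_v1_proteomz-peptides.csv", "rb")`, and compare the result with `EXPECTED_MD5`.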


Related Datasets

IsSupplementTo
Saito, M. A. (2024) Total spectral count of proteins from R/V Falkor cruise 160115 for the ProteOMZ expedition in the Central Pacific in 2016. Biological and Chemical Oceanography Data Management Office (BCO-DMO). (Version 3) Version Date 2022-06-06 doi:10.26008/1912/bco-dmo.737620.3


Parameters

Parameter | Description | Units
sample_id | Unique sample name for the specific filter collected (station/depth/version if applicable) | unitless
MS_MS_sample_name | Unique name for the mass spec sample and run | unitless
protein_id | The specific name of the full-length protein sequence assembled in the metagenome that was used for peptide identification | unitless
protein_molecular_weight_kDa | Molecular weight of the full-length protein sequence | kDa
best_protein_id_probability | Probability of the protein assignment for the peptide | unitless
peptide_sequence | Amino acid sequence of the identified peptide. Unique peptide sequence; this is the most specific identifier | unitless
peptide_start_index | Starting amino acid in the full-length protein sequence for the identified peptide | unitless
peptide_stop_index | Stopping amino acid in the full-length protein sequence for the identified peptide | unitless
plus2H_spectra_count | Number of identified spectral counts for a peptide of the +2 charge state | count
plus3H_spectra_count | Number of identified spectral counts for a peptide of the +3 charge state | count
plus4H_spectra_count | Number of identified spectral counts for a peptide of the +4 charge state | count
best_sequest_DCn_score | Delta CN score for peptide spectrum match (PSM). Metric of peptide quality | unitless
best_sequest_Xcorr_score | XCorr score for peptide spectrum match (PSM). Metric of peptide quality | unitless
median_retention_time | Median amount of time for a peptide in LC before it was identified via MS/MS | minutes
total_precursor_intensity | Total precursor intensity | unitless
TIC | Total Ion Chromatogram (TIC) | unitless
spectral_count_sum | The sum of spectral counts for all peptide proton ionization states. Sum of +2, +3, and +4 data = total unnormalized spectral counts; quantitative value | count
other_protein_ids | All other possible proteins in the metagenome that contain the same peptide as the protein assigned. Other protein IDs from the FASTA file (see Related Datasets) | unitless
station_id | Station number | unitless
depth_m | Depth of sampling | meters
latitude_dd | Latitude of station | decimal degrees
longitude_dd | Longitude of station | decimal degrees
date_ymd | Date of sampling (local time zone HST) | unitless
time_hms | Time of sampling (local time zone HST) | unitless
minimum_filter_size_microns | Minimum filter size | microns
maximum_filter_size_microns | Maximum filter size | microns
cruise_id | The unique cruise identifier | unitless
ISO_DateTime_UTC | DateTime with timezone (UTC) of sampling in ISO 8601 format | unitless
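
The relationship between the three charge-state columns and spectral_count_sum, and the peptide-by-sample layout mentioned in the Abstract, can be sketched with the standard library alone. The function names and demo rows are hypothetical, and treating blank cells as zero is an assumption rather than something the dataset documentation states.

```python
from collections import defaultdict

CHARGE_COLUMNS = (
    "plus2H_spectra_count",
    "plus3H_spectra_count",
    "plus4H_spectra_count",
)


def charge_state_total(row):
    """Sum the +2, +3, and +4 spectral counts; blank cells are treated as zero."""
    return sum(float(row[c]) for c in CHARGE_COLUMNS if row.get(c) not in (None, ""))


def peptide_by_sample(rows):
    """Build peptide_sequence -> sample_id -> summed spectral count."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row["peptide_sequence"]][row["sample_id"]] += charge_state_total(row)
    return {peptide: dict(samples) for peptide, samples in table.items()}


# Made-up rows in the shape csv.DictReader would yield (all values are strings).
demo_rows = [
    {"sample_id": "demo_A", "peptide_sequence": "PEPTIDEK",
     "plus2H_spectra_count": "2", "plus3H_spectra_count": "1",
     "plus4H_spectra_count": "", "spectral_count_sum": "3"},
    {"sample_id": "demo_B", "peptide_sequence": "PEPTIDEK",
     "plus2H_spectra_count": "4", "plus3H_spectra_count": "0",
     "plus4H_spectra_count": "1", "spectral_count_sum": "5"},
]

# Rows whose stored sum disagrees with the recomputed one.
mismatches = [r for r in demo_rows
              if charge_state_total(r) != float(r["spectral_count_sum"])]
matrix = peptide_by_sample(demo_rows)
print(mismatches, matrix)
```

A check like `mismatches` is a cheap way to confirm that a file was parsed with the expected column order before building the larger matrix.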



Instruments

Dataset-specific Instrument Name: Alpkem Autosampler
Generic Instrument Name: Alpkem RFA300
Dataset-specific Description: Used in nutrient analysis
Generic Instrument Description: A rapid flow analyser (RFA) that may be used to measure nutrient concentrations in seawater. It is an air-segmented, continuous flow instrument comprising a sampler, a peristaltic pump which simultaneously pumps samples, reagents and air bubbles through the system, analytical cartridge, heating bath, colorimeter, data station, and printer. The RFA-300 was a precursor to the smaller Alpkem RFA/2 (also RFA II or RFA-2).

Dataset-specific Instrument Name: SeaBird SBE19 CTD
Generic Instrument Name: CTD Sea-Bird
Dataset-specific Description: Used for water sampling
Generic Instrument Description: Conductivity, Temperature, Depth (CTD) sensor package from SeaBird Electronics, no specific unit identified. This instrument designation is used when specific make and model are not known. See also other SeaBird instruments listed under CTD. More information from Sea-Bird Electronics.

Dataset-specific Instrument Name: Technicon AutoAnalyzer II
Generic Instrument Name: Technicon AutoAnalyzer II
Dataset-specific Description: Used to measure phosphate and ammonium
Generic Instrument Description: A rapid flow analyzer that may be used to measure nutrient concentrations in seawater. It is a continuous segmented flow instrument consisting of a sampler, peristaltic pump, analytical cartridge, heating bath, and colorimeter. See more information about this instrument from the manufacturer.

Dataset-specific Instrument Name: Trace Metal Rosette
Generic Instrument Name: Trace Metal Bottle
Dataset-specific Description: Used for nutrient sampling
Generic Instrument Description: Trace metal (TM) clean rosette bottle used for collecting trace metal clean seawater samples.



Deployments

FK160115

Platform: R/V Falkor
Start Date: 2016-01-16
End Date: 2016-02-11
Description: Project: Using Proteomics to Understand Oxygen Minimum Zones (ProteOMZ). More information is available from the ship operator at https://schmidtocean.org/cruise/investigating-life-without-oxygen-in-the... Additional cruise information is available from the Rolling Deck to Repository (R2R): https://www.rvdata.us/search/cruise/FK160115



Project Information

The ProteOMZ Expedition: Investigating Life Without Oxygen in the Pacific Ocean (ProteOMZ (Proteomics in an Oxygen Minimum Zone))


Coverage: Central Pacific Ocean (Hawaii to Tahiti)


From Schmidt Ocean Institute's ProteOMZ Project page:

Rising temperatures, ocean acidification, and overfishing have now gained widespread notoriety as human-caused phenomena that are changing our seas. In recent years, scientists have increasingly recognized that there is yet another ingredient in that deleterious mix: a process called deoxygenation that results in less oxygen available in our seas.

Large-scale ocean circulation naturally results in low-oxygen areas of the ocean called oxygen deficient zones (ODZs). The cycling of carbon and nutrients – the foundation of marine life, called biogeochemistry – is fundamentally different in ODZs than in oxygen-rich areas. Because researchers think deoxygenation will greatly expand the total area of ODZs over the next 100 years, studying how these areas function now is important in predicting and understanding the oceans of the future. This first expedition of 2016 led by Dr. Mak Saito from the Woods Hole Oceanographic Institution (WHOI) along with scientists from University of Maryland Center for Environmental Science, University of California Santa Cruz, and University of Washington aimed to do just that, investigate ODZs.

During the 28 day voyage named “ProteOMZ,” researchers aboard R/V Falkor traveled from Honolulu, Hawaii to Tahiti to describe the biogeochemical processes that occur within this particular swath of the ocean’s ODZs. By doing so, they contributed to our greater understanding of ODZs, gathered a database of baseline measurements to which future measurements can be compared, and established a new methodology that could be used in future research on these expanding ODZs.




Funding

Funding Source | Award
Gordon and Betty Moore Foundation: Marine Microbiology Initiative (MMI) | (none listed)
Schmidt Ocean Institute (SOI) | (none listed)
