Tests for Quality Control¶

An automatic quality control procedure is typically a composition of checks, each one looking at a different aspect to identify bad measurements. This section covers the concepts behind the available checks and some ways they can be combined.

A description and references for each test are available in qctests. The result of each test is a flag ranking the quality of the data as described in Flags. Finally, most users will probably follow one of the recommended procedures (GTSPP, Argo, QARTOD …) described in Quality Control Procedures. If you are not sure what to do, start with one of those QC procedures and later fine-tune it for your needs. The default procedure for CoTeDe is the result of my experience with the World Ocean Database.

Flags¶

The outcome of the QC evaluation is encoded following the IOC recommendation given in the table below. For example, if the climatology database is not available, the output flag of the climatology test would be 0, while a failure on the same test would return flag 3 if following the GTSPP recommendations. After all checks, each measurement receives an overall flag equal to the highest flag among all tests applied. Therefore, a measurement with overall flag 0 was not evaluated at all, while a measurement with overall flag 4 means that at least one check considered it bad data.

Flag	Meaning
0	No QC was performed
1	Good data
2	Probably good data
3	Probably bad data
4	Bad data
9	Missing data
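The overall flag rule described above can be sketched in a few lines. This is only an illustration using NumPy, not CoTeDe's own implementation: the test names, the flag values, and the helper `overall_flag` are hypothetical, and the sketch treats every flag value (including 9) as a plain number when taking the maximum.

```python
import numpy as np

# Hypothetical per-test flags for a profile with four measurements.
# Test names and values are illustrative only.
flags = {
    "global_range": np.array([1, 1, 4, 1]),
    "spike": np.array([1, 3, 1, 1]),
    # Climatology database not available: the test returns 0 everywhere.
    "climatology": np.array([0, 0, 0, 0]),
}


def overall_flag(flags):
    """Return, for each measurement, the highest flag among all tests."""
    return np.max(np.stack(list(flags.values())), axis=0)


overall = overall_flag(flags)
print(overall)
```

The third measurement ends with overall flag 4 because one test considered it bad, even though the other tests did not object.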

Flags 2 and 3 usually cause some confusion: “What do you mean by probably good or bad?” The idea is to leave some choice to the final user. Defining the criteria for any QC test is a trade-off between minimizing false positives and minimizing false negatives, so it is a choice: what is more important for you? There is no unique answer for all cases. Most users will treat anything greater than 2 as a non-valid measurement. Someone willing to pay the price of losing more data to avoid any bad data by all means would rather discard anything greater than 1, while someone more concerned about not wasting any data, even if that means a few mistakes, would discard only what is greater than 3. When designing a test or defining a new threshold, please assume that flag 4 means the test is quite confident that the measurement is bad.
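The three choices above differ only in the highest flag that is still accepted. A minimal sketch, assuming the flags are stored in a NumPy array; the helper `is_valid` and its parameter `max_accepted` are hypothetical names, not part of any library:

```python
import numpy as np


def is_valid(flags, max_accepted=2):
    """Return True where the flag is not greater than max_accepted.

    Note: this follows the rule literally, so flag 0 (no QC performed)
    also passes. A caller may want to handle that case separately.
    """
    return np.asarray(flags) <= max_accepted


flags = np.array([1, 2, 3, 4])

typical = is_valid(flags)                      # discard anything > 2
strict = is_valid(flags, max_accepted=1)       # discard anything > 1
permissive = is_valid(flags, max_accepted=3)   # discard anything > 3
```

Keeping the threshold as a parameter leaves the decision with the final user, which is the purpose of flags 2 and 3.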

It is typically expected to have one flag for each measurement in the dataset, but it is possible to have a situation with a single flag for the whole dataset. For instance, if a profile is checked only for a valid geolocation, it would get a single flag for the whole profile.
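When a single flag for the whole profile has to be combined with per-measurement flags, broadcasting is one way to do it. This is a sketch with made-up values, assuming the same "highest flag wins" rule used for the overall flag:

```python
import numpy as np

# Hypothetical per-measurement flags from a temperature check.
temperature_flags = np.array([1, 1, 2, 1])

# Hypothetical single flag for the whole profile, e.g. from a
# geolocation check that failed.
position_flag = 4

# The scalar is broadcast against every measurement.
overall = np.maximum(temperature_flags, position_flag)
```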

Some procedures also provide a continuous scale, usually representing the probability of a measurement being good, such as Anomaly Detection and Fuzzy Logic. For details, please check the description of the specific check.

Quality Control Procedures¶

Although I slightly modified the names of some QC tests, the concept behind each one is still the same. The goal was to normalize all tests to return True if the data is good and False if the data is bad. For example, Argo’s manual defines the “Impossible Date Test”, while here I call it “Valid Date”.
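With that True/False convention, turning the result of a check into a flag is a simple mapping. The sketch below is illustrative only: `flag_from_check` is a hypothetical helper, and the default values 1 and 4 are the "if succeed" and "if fail" flags listed for Valid Date in the GTSPP table.

```python
import numpy as np


def flag_from_check(is_good, flag_good=1, flag_bad=4):
    """Map a boolean check result (True means good data) to QC flags."""
    return np.where(np.asarray(is_good, dtype=bool), flag_good, flag_bad)


# Hypothetical outcome of a "Valid Date" check for three profiles.
valid_date = [True, False, True]
date_flags = flag_from_check(valid_date)
```

A procedure that uses different flags for the same test, such as 3 on failure, only needs to pass a different `flag_bad`.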

Profile¶

GTSPP¶

Test	Flag		Threshold
	if succeed	if fail	Temperature	Salinity
Valid Date	1	4
Valid Position	1
Location at Sea	1
Global Range	1		-2 to 40 C	0 to 41
Gradient	1	4	10.0 C	5
Spike	1		2.0 C	0.3
Climatology	1
Profile Envelop
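To make the thresholds above concrete, here is a rough sketch of a global range check and a gradient check using the GTSPP temperature values. It is not CoTeDe's implementation. The gradient is assumed to be the commonly used three-point form |V2 - (V1 + V3)/2|, the range bounds are assumed to be inclusive, and a value exactly at the gradient threshold is assumed to pass. Since the table gives no "if fail" flag for Global Range, that check only returns True/False here.

```python
import numpy as np


def global_range(values, minval, maxval):
    """Return True where the value lies within [minval, maxval]."""
    values = np.asarray(values, dtype=float)
    return (values >= minval) & (values <= maxval)


def gradient_flags(values, threshold, flag_good=1, flag_bad=4):
    """Flag each point by |V2 - (V1 + V3) / 2| against a threshold.

    The first and last points have no two neighbors, so they keep
    flag 0 (no QC performed).
    """
    v = np.asarray(values, dtype=float)
    flags = np.zeros(v.shape, dtype=int)
    grad = np.abs(v[1:-1] - (v[:-2] + v[2:]) / 2.0)
    flags[1:-1] = np.where(grad > threshold, flag_bad, flag_good)
    return flags


# Hypothetical temperature profile (C) with one unrealistic value.
temperature = np.array([20.0, 19.5, 45.0, 18.0, 17.5])

in_range = global_range(temperature, -2.0, 40.0)
grad_flags = gradient_flags(temperature, threshold=10.0)
```

Note that with this three-point form a single bad value also raises the gradient at its two neighbors.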

EuroGOOS¶

Test	Flag		Threshold
	if succeed	if fail	Temperature	Salinity
Valid Date	1	4
Valid Position	1	4
Location at Sea	1	4
Global Range	1	4	-2.5 to 40 C	2 to 41
Digit Rollover	1	4	10.0 C	5
Gradient Depth Conditional, < 500	1	4	9.0 C	1.5
Gradient Depth Conditional, > 500	1	4	3.0 C	0.5
Spike Depth Conditional, < 500	1	4	6.0 C	0.9
Spike Depth Conditional, > 500	1	4	2.0 C	0.3
Climatology	1
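The depth-conditional rows of the EuroGOOS table amount to choosing a threshold per measurement according to its depth. A minimal sketch, assuming depth is given in the same unit as the 500 limit in the table and that a depth of exactly 500 uses the deeper threshold (the table does not say); `depth_conditional_threshold` is a hypothetical helper:

```python
import numpy as np


def depth_conditional_threshold(depth, shallow, deep, limit=500.0):
    """Return the threshold to use at each depth.

    Depths below `limit` get `shallow`, all others get `deep`.
    """
    depth = np.asarray(depth, dtype=float)
    return np.where(depth < limit, shallow, deep)


# Hypothetical depths of four measurements.
depth = np.array([10.0, 250.0, 750.0, 1500.0])

# EuroGOOS temperature thresholds from the table above.
gradient_threshold = depth_conditional_threshold(depth, shallow=9.0, deep=3.0)
spike_threshold = depth_conditional_threshold(depth, shallow=6.0, deep=2.0)
```

The resulting arrays can be compared element by element with the output of a gradient or spike calculation.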

Argo (Incomplete)¶

Test	Flag		Threshold
	if succeed	if fail	Temperature	Salinity
Platform Identification
Valid Date
Impossible location test
Position on land test
Impossible speed test
Global Range
Regional Range
Pressure increasing
Spike
Top and bottom spike test: obsolete
Gradient (obsolete in 2020)
Digit Rollover
Stuck value test
Density Inversion
Grey list
Gross salinity or temperature sensor drift
Visual QC
Frozen profile test
Deepest pressure test

IMOS (Incomplete)¶

Test	Flag		Threshold
	if succeed	if fail	Temperature	Salinity
Valid Date	1	3
Valid Position	1	3
Location at Sea	1	3
Global Range	1		-2.5 to 40 C	2 to 41
Gradient	1	4	10.0 C	5
Spike	1		2.0 C	0.3
Climatology	1

QARTOD (Incomplete)¶

Test	Flag		Threshold
	if succeed	if fail	Temperature	Salinity
Gap
Syntax
Location at Sea
Gross Range
Climatological
Spike
Rate of Change		4
Flat Line
Multi-Variate
Attenuated Signal
Neighbor
TS Curve Space
Density Inversion		3	0.03 kg/m3
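As a rough illustration of the Density Inversion row, the sketch below looks for pairs of consecutive levels where density decreases with depth by more than 0.03 kg/m3. It assumes that density was already computed and is ordered from the shallowest to the deepest measurement, and that this is what "inversion" means here; that reading is mine, not a definition given on this page. `density_inversion` is a hypothetical helper.

```python
import numpy as np


def density_inversion(density, threshold=0.03):
    """Return True for each pair of consecutive levels where density
    drops by more than `threshold` (kg/m3) going deeper.

    The output has one element per pair, so len(density) - 1 elements.
    """
    density = np.asarray(density, dtype=float)
    return np.diff(density) < -threshold


# Hypothetical densities (kg/m3), shallowest first.
density = np.array([1025.0, 1025.5, 1025.4, 1026.0])
inverted = density_inversion(density)
```

How the flag 3 from the table is then assigned (to the shallower level, the deeper one, or both) depends on the procedure and is left open here.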

TSG¶

Based on the AOML procedure. Realtime data is evaluated by tests 1 to 10, while delayed-mode data is evaluated by tests 1 to 15.

1. Platform Identification
2. Valid Date
3. Impossible Location
4. Location at Sea
5. Impossible Speed
6. Global Range
7. Regional Range
8. Spike
9. Constant Value
10. Gradient
11. Climatology
12. NCEP Weekly analysis
13. Buddy Check
14. Water Samples
15. Calibrations