WildHealth Data Model
The WildHealth Data Model is designed to accommodate wildlife health data from various sources, including local to national wildlife health surveillance systems, research initiatives, and citizen-science based projects.
Wildlife health surveillance (WHS) is critical for addressing health hazards threatening wildlife and undermining One Health. It involves a comprehensive set of activities aimed at continuously and rapidly generating and analyzing information on the health of wildlife and associated hazards. WHS systems, the organized set of resources and procedures to conduct WHS, generate information on health state (e.g., trace minerals, immune function, pathogens, toxic agents), diseases, and their population-level effects in wildlife, with the objective to prevent, mitigate, control, or eliminate health hazards and support population resilience.
As a result, WHS systems rely on diverse data, including environmental features, animal observations, specimen collections, necropsy findings, diagnostic test results, photographic records, video footage, etc., generated by many actors through various methodologies. These diverse data need to be harmonized under a standardized structure and vocabulary to facilitate assessments and response, and their inclusion in multisectoral One Health frameworks.
However, the management of WHS data is a long-standing problem that persists today. WHS data are often lost, and harmonization and standardization remain poorly implemented. The Wildlife Health Intelligence Network developed this data model to address these challenges and promote best data practices.
Introduction
The data model contains hierarchically organized units that capture relevant epidemiological information at each step:
- Projects: A set of Surveillance Activities with a common leader and organizer (e.g., a national wildlife health surveillance system).
- Surveillance Activities: Specific sets of surveillance goals and methodology documented following a standard metadata format.
- Field Visits: Time period during which activities are conducted in the field.
- Locations: Specific areas where surveillance activities are conducted.
- Events: Epidemiological units with a spatiotemporal coordinate where wildlife health surveillance data is collected.
- Collections: Different methods and efforts to obtain Source Records from Sources within an Event.
- Sources: Units that can provide Specimens for hazard detection or whose health is assessed at a specific time. There are four Source types:
- Group: groups of animals of the same species.
- Animal: individual animals that can be observed, examined, sampled, and necropsied.
- Environmental: environmental sampling sites (e.g., ponds or feces in the field).
- Arthropod: arthropod capture sites.
- Source Records: Records of the Sources at a specific time.
- Specimens: Tissues taken from the Sources at a specific time.
- Diagnostics: Tests conducted on live animals, carcasses, or collected Specimens to assess hazards.
- Interpretation: The final status assigned to a Diagnostic, Specimen, or Source Record following documented case definitions.
The basic relationships among the basic units of the data model are shown in Figure 1 below:
Figure 1: Basic relationships among the basic units of the data model.
The data model adapts to the complexity of the wildlife health data, remaining simple for straightforward cases and becoming more intricate as data structure and variability increase. Many components of the model are conditional, meaning they are needed depending on the data structure. For example, consider wildlife mortality reported by a community member through a mobile application:
- The Surveillance Activity metadata must describe the methodology.
- The Field Visit could be defined annually to categorize reports by year.
- The Location could be a county or zip code.
- The Event is the epidemiological unit with latitude, longitude, and time representing dead animals found.
- No Specimens, Diagnostics, or Laboratory data are generated.
- No Collection effort is linked to the Events because the animals are found opportunistically.
- The animals are not grouped beyond the Field Visit, Location, and Events, so no clusters are needed.
Figure 2: Components of the data model to contain data from example 1.
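As a rough illustration of this example, the sketch below represents one community report using only the units it needs. The class names, field names, identifiers, and coordinate values are hypothetical; they are not taken from the Data Dictionary.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SourceRecord:
    source_type: str                     # "Group", "Animal", "Environmental", or "Arthropod"
    species: str
    health_status: str
    collection_id: Optional[str] = None  # None: found opportunistically, no Collection


@dataclass
class Event:
    event_id: str
    location: str      # e.g., a county or zip code
    field_visit: str   # e.g., the reporting year
    latitude: float
    longitude: float
    timestamp: str
    source_records: List[SourceRecord] = field(default_factory=list)


# Made-up values for a single community report of one dead animal.
report = Event(
    event_id="EV-0001",
    location="County A",
    field_visit="2024",
    latitude=45.0,
    longitude=-75.0,
    timestamp="2024-05-03T10:15:00",
    source_records=[SourceRecord("Animal", "Species X", "dead")],
)

# No Collection is attached to any Source Record in this report.
has_collection = any(r.collection_id is not None for r in report.source_records)
print(has_collection)  # False
```

Units that the example does not need (Collections, Specimens, Diagnostics) are simply absent from the record rather than filled with placeholders.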
This flexible structure ensures that the model remains efficient and scalable, adapting to different surveillance needs.
The data model is not designed for wildlife population monitoring. However, it includes key identifiers that enable seamless integration between wildlife health and population data.
More complex relationships between the fundamental units of the data model, as well as additional details, are covered in the Complexities section. Before exploring these advanced topics, it is recommended to first understand the core components of the model, from Project to Diagnostics, by continuing with this documentation. Use the menu in the top-right corner of this website to navigate to the relevant sections.
Main Units of the Data Model
Project
A Project in the data model represents a surveillance initiative supported by specific entities. Examples include the PREDICT Project funded by USAID, a single cross-sectional study with one field visit to a specific location (e.g., sample collection in a market), or a national or local wildlife health surveillance network managed by a government agency.
Projects serve as the highest hierarchical unit in the database and must contain at least one Surveillance Activity. They can be time-limited, spanning a single date, or ongoing for an extended period as needed.
Key properties of a Project include:
- Project ID
- Project Code
- Project Cross-Reference ID
- Project Cross-Reference Origin
- Project Leader
- Project Funding Source
Other attributes are provided in the Data Dictionary.
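The rule that a Project must contain at least one Surveillance Activity can be checked mechanically. The following is a minimal sketch, assuming a simplified Project record; the field names and the `validate_project` helper are hypothetical and only loosely follow the properties listed above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Project:
    project_id: str
    project_code: str
    project_leader: str
    funding_source: str
    surveillance_activity_ids: List[str] = field(default_factory=list)


def validate_project(project: Project) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    if not project.surveillance_activity_ids:
        problems.append("Project must contain at least one Surveillance Activity")
    return problems


# Placeholder values for illustration only.
empty = Project("P-1", "DEMO-1", "Leader name", "Funder name")
valid = Project("P-2", "DEMO-2", "Leader name", "Funder name", ["SA-1"])

print(validate_project(empty))
print(validate_project(valid))
```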
Surveillance Activity
A Surveillance Activity in the data model refers to a coordinated set of activities aimed at detecting diseases, demonstrating disease freedom, measuring incidence or prevalence, or assessing trends in health and health hazards within defined populations. In general, a specific set of methods, strategies, and objectives should equal a specific Surveillance Activity.
For example, the longitudinal assessment of coronavirus shedding in two Eidolon helvum bat roosts in Africa involved collecting a fixed number of fecal samples from two bat roosts of the same species every month for 12 months, with subsequent testing for Coronaviridae sp.
Each Surveillance Activity includes detailed metadata describing its objectives and methods, such as:
- Targeted species, populations, and hazards
- Samples and collection methods
- Diagnostics and case definitions
A Surveillance Activity can span multiple Field Visits, Locations, and Source types (e.g., Groups, Animals, Environmental and Arthropod Sources).
Generally, Field Visits, Locations, Events, Sources, Source Records, and Diagnostics belong to a single Surveillance Activity. Furthermore, a Surveillance Activity usually includes Field Visits, Locations, Events, Sources, Source Records, and Diagnostics. However, exceptions exist (see Complexities section).
Most Surveillance Activity properties align with standard documentation of surveillance methods, including:
- Start and End Dates
- Targeted Hazards and Taxa
- Involved Organizations
- Definitions of Locations, Events, and Sources
- Source identification, Specimens, and Collection methods
- Diagnostic techniques and case definitions
Outbreak Investigation
Each outbreak investigation is considered a Surveillance Activity. Outbreak Surveillance Activities can encompass the initial Field Visit that led to the outbreak detection and subsequent Field Visits associated with the investigation and control. For instance, an outbreak may first be detected during routine patrols by rangers. In such cases, the initial Field Visit and its components should be assigned to both the ranger patrol Surveillance Activity and the Outbreak Surveillance Activity. Follow-up Field Visits (e.g., by veterinarians or public health officers investigating the outbreak) should be assigned exclusively to the Outbreak Surveillance Activity (see Complexities section).
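The ranger patrol scenario can be pictured as a many-to-many assignment between Field Visits and Surveillance Activities. The sketch below is only an illustration; the identifiers and the `field_visits_for` helper are made up, and the actual mechanism is described in the Complexities section.

```python
# Hypothetical assignment of Field Visits to Surveillance Activities.
assignments = {
    "FV-patrol-12": {"SA-ranger-patrol", "SA-outbreak"},  # visit that detected the outbreak
    "FV-followup-1": {"SA-outbreak"},                     # follow-up investigation
    "FV-followup-2": {"SA-outbreak"},                     # follow-up investigation
}


def field_visits_for(activity: str) -> list:
    """List the Field Visits assigned to a given Surveillance Activity."""
    return sorted(fv for fv, activities in assignments.items() if activity in activities)


print(field_visits_for("SA-outbreak"))
print(field_visits_for("SA-ranger-patrol"))
```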
Field Visit
A Field Visit in the data model represents a defined time period—including a start and end date—during which Locations are visited, Events and Sources are identified, and Specimens are collected and documented.
A Field Visit may encompass multiple Locations within the same trip, such as markets, natural areas, rehabilitation centers, caves, and more. While each Field Visit must include at least one Location, there is no upper limit to the number of Locations that can be visited during the trip. The length of a Field Visit is flexible.
Key properties of a Field Visit include:
- Start and End Date
- Field Visit ID and Code
- Field Visit Cross-Reference ID and Origin
- Field Visit Leader
If there are Field Visit attributes that are not part of the current data model but are of interest to track, they must be reported in the Surveillance Activity metadata. It is recommended to maintain such additional attributes in a separate system, such as another database or an Excel sheet, for reference. Common extra Field Visit attributes can be added to the data model in the future. Missing options for single- and multi-selection attributes of Field Visits can be added as long as they maintain a controlled vocabulary to ensure consistency and data integrity.
Location
In the data model, a Location represents a general area (polygon) where Events can be identified and Source Records, Carcasses, and Specimens can be collected. Unlike Events, which are defined by exact latitude and longitude coordinates, Locations serve as a broader spatial grouping for organizing data. A single Location can contain zero Events (e.g., when an area is surveyed but no Event is recorded) or multiple Events (e.g., multiple observations or findings within the same area).
The meaning of a Location depends on the methodology of the Surveillance Activity. For Arthropod collection, a Location may represent a parcel where traps are set. In a structured, hierarchical study, a Location could correspond to a grid cell. During ranger patrols, a Location might be defined as the entire protected area or a specific zone within it. Users must define the specific unit a Location represents and document it in the Surveillance Activity metadata.
Key attributes of a Location include:
- Location ID and Code
- Location Cross-Reference ID and Origin
- Location Type
- Environmental Characteristics
If there are Location attributes that are not part of the current data model but are of interest to track, they must be reported in the Surveillance Activity metadata. It is recommended to maintain such additional attributes in a separate system, such as another database or an Excel sheet, for reference. Common extra Location attributes can be added to the data model in the future. Missing options for single- and multi-selection attributes of Location can be added as long as they maintain a controlled vocabulary to ensure consistency and data integrity.
Event
In the data model, an Event represents a distinct wildlife health event recorded at a specific longitude, latitude, and timestamp. Each Event is a point that can contain anywhere from zero Collections (opportunistic detections of Sources) to an unlimited number of Collections.
The definition of a “wildlife health event”, and therefore, what an Event represents in a Surveillance Activity will inevitably vary depending on the main epidemiological unit of interest and goals. Examples include:
- A site where Animals are captured at time t for Specimen collection
- The position and time a dead animal is found during a ranger patrol
- A position and time where water from a pond is sampled for analysis
- A site where one or several traps for vectors are deployed
In the case of beached fish, an Event can represent the position and time of recording of each individual dead fish at one extreme, or the total count of dead fish across the beach reported as a single point at the other extreme. In the context of a wet market, the definition of Event can be applied to the market, to vendors, to the stalls of vendors, or to the cages in the stalls at time t. For study A the Event could be the grid cell where traps are deployed, whilst for study B it could be each trap within a grid cell.
In some cases, the presence of at least one Source may be required to establish an Event. For example: a dead animal found during a ranger patrol at time t creates an Event.
In Active Surveillance, an Event might not have any Source, for example, an Event containing a Collection that ends with no animals captured.
An Event may or may not include healthy animals. In the example of the dead animal found by the ranger at time t, the Event definition may or may not include the documentation of healthy animals.
An Event can involve more than one Source type. For example, mosquito traps (Arthropods) and Environmental sampling under the same Event.
It is possible that specific latitude and longitude coordinates are not of interest in a Surveillance Activity. For example, a set of mosquito traps may be deployed in parcels, and their specific spatial location within each parcel may not be relevant. In this case, the traps can be placed under a unique Event with an ID that links the traps at time t with the parcel where they were placed. The Event will still need spatial coordinates, but they do not represent the actual position of the Collections, Sources, etc.
This flexibility allows WHeDB to accommodate diverse surveillance strategies, ensuring meaningful and context-appropriate data representation.
It is up to the user to define whether the Event is a unit of interest and what it represents, and to document this definition in the Surveillance Activity metadata. To ensure consistency, each Surveillance Activity must maintain a single, well-established Event definition.
Key attributes of an Event include:
- Start Date
- Event ID
- Event Code
- Event Cross-Reference ID
- Event Cross-Reference Origin
- Longitude & Latitude
- Coordinate Reference System (CRS) used
If there are Event attributes that are not part of the current data model but are of interest to track, they must be reported in the Surveillance Activity metadata. It is recommended to maintain such additional attributes in a separate system, such as another database or an Excel sheet, for reference. Common extra Event attributes can be added to the data model in the future. Missing options for single- and multi-selection attributes of Event can be added as long as they maintain a controlled vocabulary to ensure consistency and data integrity.
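Because every Event carries coordinates and the Coordinate Reference System used, a basic sanity check on those values is straightforward. The sketch below is an assumption-laden illustration: the class, its field names, and the default of "EPSG:4326" (WGS 84 longitude and latitude in degrees) are hypothetical choices, not requirements of the data model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    event_id: str
    start: datetime
    longitude: float
    latitude: float
    crs: str = "EPSG:4326"  # assumed default; the model only asks that the CRS be recorded

    def __post_init__(self):
        # Degree range checks only make sense for geographic coordinates.
        if self.crs == "EPSG:4326":
            if not -180.0 <= self.longitude <= 180.0:
                raise ValueError(f"longitude out of range: {self.longitude}")
            if not -90.0 <= self.latitude <= 90.0:
                raise ValueError(f"latitude out of range: {self.latitude}")


# Made-up Event for illustration.
event = Event("EV-1", datetime(2024, 5, 3, 10, 15), longitude=30.5, latitude=-1.9)
print(event.crs)  # EPSG:4326
```

Events recorded in a projected CRS would need a different check, which is why the range test is tied to the CRS value rather than applied unconditionally.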
Collection
In the data model, a Collection represents the effort involved in observing, detecting, capturing, or otherwise identifying Sources at an Event, starting from the Event timestamp.
Examples of Collections include:
- The time spent by an observer at a fixed location to identify dead birds in a wetland, along with the tools used (e.g., telescope, binoculars).
- The number of camera traps deployed at a specific site to photograph a sick animal and the hours they were active.
- The distance and time traveled by a ranger to find a dead animal.
- The trap type, bait, deployment period, and number of traps used to collect mosquitoes at a given site.
- The distance traveled to obtain or observe Sources across a transect.
A Collection can contain between zero and an unlimited number of Source Records. For example, a capture effort that results in no animals caught.
An Event might not include any Collection. Examples include Sources found opportunistically (e.g., through news articles reporting wildlife health events) or survey and sampling of the animals in a wet market. In this case, the effort might not be a relevant unit for the Surveillance Activity (hours spent at the market) and can be ignored.
A Collection typically contains a specific type of Source Record. This is because observations, captures, and sampling efforts generally focus on a specific type of target, whether Group, Animal, Environmental, or Arthropods. However, a single Event can include multiple Collections targeting different objectives. For example, mosquito traps and an air filtration device can be linked to the same Event.
Collections are defined by:
- The number of units, which must be larger than zero.
- The type of spatial and temporal units associated with the completion of the Collection (e.g., “number of mist nets”, “number of CO₂ traps”, “number of camera traps”, “distance walked”, and “area scanned”).
- The position of the units with respect to the spatial coordinates of the Event (e.g., “at the Event”, “around the Event”, “starting at the Event”, “ending at the Event”, “starting and ending at the Event”).
- The position of the units with respect to the temporal coordinates of the Event (e.g., “starting at the Event”, “ending at the Event”).
For example, “1”, “observer standing”, and “at the Event” for number of spatial units, spatial unit, and spatial position with respect to the Event, respectively; and “6”, “hours observing with binoculars”, and “at the Event” for number of temporal units, temporal unit, and temporal position with respect to the Event, respectively. Similarly, the attributes of a Collection completed through a transect can be “2”, “hours”, and “starting at the Event” for the temporal components, and “4”, “kilometers walked”, and “starting and ending at the Event” for the spatial components.
Collections also allow problems encountered during the search for Sources to be characterized, such as camera traps that ran out of battery or were stolen, torn mist nets, etc.
When Source Records are not associated with a Collection (e.g., opportunistic finding of dead animals), the following properties are set to NA: “number of units” = NA, “unit” = NA, and “positioning” with respect to the Event = NA.
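The effort properties described above (number of units, unit, and position relative to the Event) can be sketched as a small record with two rules: the number of units must be larger than zero, and an opportunistic finding leaves all three properties as NA. The `Effort` class and its field names are hypothetical, and `None` stands in for NA.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Effort:
    number_of_units: Optional[float]
    unit: Optional[str]
    position: Optional[str]

    def __post_init__(self):
        values = (self.number_of_units, self.unit, self.position)
        if all(v is None for v in values):
            return  # all NA: Source Records not associated with a Collection
        if any(v is None for v in values):
            raise ValueError("set all three properties or leave all three as NA")
        if self.number_of_units <= 0:
            raise ValueError("number of units must be larger than zero")


# Transect example from the text: temporal and spatial components.
transect_temporal = Effort(2, "hours", "starting at the Event")
transect_spatial = Effort(4, "kilometers walked", "starting and ending at the Event")

# Opportunistic finding: every property is NA (None here).
opportunistic = Effort(None, None, None)
```

Treating a partially filled record as an error is a design choice of this sketch; the text only specifies the fully populated and fully NA cases.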
Key attributes of a Collection include:
- Collection ID
- Collection Code
- Collection Cross Reference ID
- Collection Cross Reference Origin
If there are Collection attributes that are not part of the current data model but are of interest to track, they must be reported in the Surveillance Activity metadata. It is recommended to maintain such additional attributes in a separate system, such as another database or an Excel sheet, for reference. Common extra Collection attributes can be added to the data model in the future. Missing options for single- and multi-selection attributes of Collection can be added as long as they maintain a controlled vocabulary to ensure consistency and data integrity.
Source and Source Records
General Overview
In the data model, a Source is a unit that can be observed and can provide Specimens for analysis, or whose observation at time t is used in a diagnostic test (e.g., a full Animal for an MRI). Sources can provide anywhere from zero Specimens (e.g., observation only) to an unlimited number of Specimens for Diagnostic tests. The data model manages four types of Sources: i) a co-specific group of animals (Group), ii) individual animals (Animal), iii) sites that can provide abiotic material of interest, or biotic material of animal origin whose individual or group of origin is unknown (Environmental), and iv) sites where arthropods are obtained (Arthropod). A Group Source could correspond to a specific bat roost of species X sourcing guano, an Animal Source could be a collared animal, an Environmental Source could be a pond from which water is collected or feces found in the field whose animal of origin is unknown, and an Arthropod Source could be a site where mosquito larvae are collected. Sources contain time-independent data only, such as the species of an animal in the case of Group or Animal Sources (more below).
All Sources can potentially be identified and tracked over time if needed. However, there are limitations: Environmental and Arthropod Sources are site-specific, which makes them easier to track across different time points, whereas Animal and Group Sources may not always be individually identifiable, which can prevent tracking them over time.
A Source collected or captured at time t can be linked to Events through a Collection (see previous section) via a Source Record, which represents the Source at that specific time t. Source Records contain only time-dependent data, such as health status at time t for Group and Animal Sources.
Individually identified Sources can be tracked across multiple Events, even across different Surveillance Activities. For example, a bat (Source) is captured in a mist net (Collection) deployed at a roost at time t (Event), individually identified, and later recaptured in another Event at time t′. This bat is linked to both the original capture Event and the recapture Event, with a unique Source Record for each occurrence.
Sources that are not identified are linked, as a Source Record, only to the Event where they were observed or captured. Examples include a bat (Source) that is captured (Collection) and sampled during an Event but not marked, or an unidentified dead animal found by a ranger during a patrol. The bat and the dead animal are linked only to the Event where they were captured and observed, respectively.
Sources found opportunistically are still linked to an Event, even if no formal Collection occurred. For example, a member of the public reports a dead animal (Source Record) found on a beach at time t (Event).
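The link between Sources, Source Records, and Events described above can be sketched with a few tuples. All identifiers below are made up, and the `events_by_source` helper is a hypothetical illustration rather than part of the data model: a marked bat accumulates one Source Record per capture, whereas an unmarked bat has a single Source Record tied to one Event.

```python
from collections import defaultdict

# (source_record_id, source_id, event_id); all identifiers are made up.
source_records = [
    ("SR-1", "BAT-001", "EV-t"),        # first capture, bat is marked
    ("SR-2", "BAT-001", "EV-t-prime"),  # recapture of the same bat
    ("SR-3", "BAT-002", "EV-t"),        # captured and sampled but not marked
]


def events_by_source(records):
    """Map each Source to the Events it is linked to through its Source Records."""
    linked = defaultdict(list)
    for _record_id, source_id, event_id in records:
        linked[source_id].append(event_id)
    return dict(linked)


links = events_by_source(source_records)
print(links["BAT-001"])  # ['EV-t', 'EV-t-prime']
print(links["BAT-002"])  # ['EV-t']
```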
Group Source
A Group Source is a unit of co-specific individuals (animals of the same species) that are associated with a shared location or entity, such as a herd, area, site, farm, cage, stall, or enclosure, forming a single epidemiological unit. Group Sources can be observed, captured, and provide Specimens collectively at a given time t when the Event occurs.
Key attributes of a Group Source include:
- Species
- Group Source ID
- Group Source Cross Reference ID
- Group Source Cross Reference Origin
The primary purpose of a Group Source is to record individuals at the species level rather than tracking each one separately. This approach is useful when herds are the unit of interest, when protected area rangers document multiple animals of the same species during a health event, or when animals in a market are grouped in cages or stalls without individual identification. For example, if a single cage contains animals from two different species and they are not tracked individually, they are recorded as two separate Group Sources—one per species.
Group Source Record
A Group Source Record documents the count of animals within a Group Source at time t, categorized by sex, age, and health status (e.g., healthy, injured, sick, or dead). Additional attributes include:
- Observed anomalies
- Potential cause of disease
- Potential cause of death
Since properties of a Group Source Record are recorded at the group level, multiple options can be reported for these attributes. For example, if a Group Source Record includes three dead animals, several potential causes of death may be listed. However, it is not possible to determine how these potential causes or other properties are distributed across individuals within the Group Source Record—only that they were present in at least one individual.
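One way to picture a Group Source Record is as a table of counts keyed by sex, age, and health status, plus group-level attributes that hold several values at once. The sketch below is a simplified assumption: the class, field names, category labels, and counts are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class GroupSourceRecord:
    group_source_id: str
    # (sex, age, health status) -> number of animals
    counts: Dict[Tuple[str, str, str], int]
    # Group-level attribute: each value was present in at least one individual.
    potential_causes_of_death: Set[str] = field(default_factory=set)

    def total(self) -> int:
        return sum(self.counts.values())

    def count_by_status(self, status: str) -> int:
        return sum(n for (_sex, _age, s), n in self.counts.items() if s == status)


# Made-up record with three dead animals and several potential causes of death.
record = GroupSourceRecord(
    "GS-1",
    counts={
        ("female", "adult", "dead"): 2,
        ("male", "juvenile", "dead"): 1,
        ("female", "adult", "healthy"): 4,
    },
    potential_causes_of_death={"trauma", "poisoning"},
)

print(record.total(), record.count_by_status("dead"))  # 7 3
```

Nothing in this structure says which of the three dead animals each potential cause applies to, which mirrors the limitation described above.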
A Group Source Record can include a mix of dead, diseased, poisoned, infected, injured, and healthy individuals of the same species, or it may consist solely of one category if all individuals share the same status (e.g., all are dead). Additionally, even a Group Source Record consisting only of healthy animals can be part of a Health Event, depending on the Event’s definition.
A Group Source Record may contain a single individual. For example, rangers patrolling a protected area might find one dead animal of species X and two dead animals of species Y at the same site (Event), documenting them as separate Group Source Records (one group per species). Similarly, if only a single animal from a tracked herd is observed at time t, that individual represents the Group Source Record for the herd.
An Event can include multiple Group Source Records of the same species. For example, if a vendor at a market keeps animals of the same species in two separate cages, and the Event is a vendor, then each cage could be considered a distinct unit. Consequently, there would be two Group Source Records for the same species under the same vendor (i.e., within the same Event).
A Group Source can be directly used for a Diagnostic (e.g., assessing the body condition of a specific herd). The data model supports Diagnostics applied to an entire group rather than just a Specimen taken from that Group Source. Specimens from Group Sources can be stored, exported, and used for Diagnostics. However, the data model does not allow Group Source carcasses to be exported or stored collectively as a set of carcasses. This is because their collection and handling present an opportunity to gather data at the individual level and to record each carcass as an Animal Source. A similar principle applies to Necropsies and the Specimens collected from carcasses. Carcasses of animals of a Group Source collected or taken for Necropsy must be converted into Animal Sources. These Animal Sources are recorded as former members of the Group Source (see the Complexities section for further details).
For similar reasons, Group Sources cannot be removed from the field. If dead animals from a Group Source are collected, each must be documented as an Animal Source, whose Carcass is taken, originating from the corresponding Group Source (see Animal Sources and Carcass below).
The database does not allow live animals from Group Sources to be taken ex situ.
An animal of species X recorded in a Group Source Record must not be duplicated as an Animal Source Record, and vice versa. If an Event contains both a Group Source Record and an Animal Source Record of species X, the total number of animals of that species in the Event is the sum of those in the Group Source Record and the individual(s) recorded separately as Animal Source Record (e.g., the individuals in the group of animals of species X plus a carcass collected of the same species that would have belonged to the count of dead animals of the Group Source Record if not collected).
For example, consider a herd of 20 cows illegally raised in a protected area whose health is assessed at time t. If two cows are sampled, the two sampled cows should be documented as two separate Animal Source Records (one per cow), while the remaining 18 cows are recorded as a single Group Source Record, categorized by sex, age, and health status. In this scenario, the total number of cows is 20: the 18 recorded in the Group Source Record plus the 2 documented as Animal Source Records. The “herd” identity of these 20 cows, split into three Source Records, can be maintained using a cluster (see Complexities). If, however, the Group Source Record incorrectly includes all 20 cows while two of them are also recorded as Animal Source Records, the total count would incorrectly sum to 22, introducing duplication.
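The arithmetic of the herd example can be written out as a short sketch. The `total_animals` function and its simplified inputs (species and count pairs for Group Source Records, one species entry per Animal Source Record) are hypothetical.

```python
def total_animals(species, group_counts, animal_records):
    """Sum Group Source Record counts and Animal Source Records for one species."""
    in_groups = sum(n for sp, n in group_counts if sp == species)
    individuals = sum(1 for sp in animal_records if sp == species)
    return in_groups + individuals


# Correct: 18 cows in the Group Source Record, 2 sampled cows recorded individually.
correct = total_animals("cow", [("cow", 18)], ["cow", "cow"])

# Incorrect: the Group Source Record still counts the 2 sampled cows.
duplicated = total_animals("cow", [("cow", 20)], ["cow", "cow"])

print(correct, duplicated)  # 20 22
```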
Animal Source
An Animal Source represents a single individual whose individual-level data is relevant. Animal Sources can be observed, captured, tested, and provide Specimens at time t, including full Carcasses for Necropsy.
Key attributes of an Animal Source include:
- Species
- Animal Source ID
- Animal Source Cross Reference ID
- Sex
Any past marking codes assigned to the animal are considered immutable and are recorded as part of the Animal Source data.
For example, in a live animal market, if data is collected at the individual level, each animal—whether of the same or different species—must be recorded as a separate Animal Source. In contrast, if animals are documented collectively (e.g., by species in a shared cage), they would be classified under a Group Source rather than as individual Animal Sources.
Animal Source Record
An Animal Source Record represents an individual Animal Source at time t.
Key attributes of an Animal Source Record include:
- Species
- Age
- Health Status
- Observed Anomalies
- Potential Causes of Disease or Death
The current marking code of an individual animal, if any, at time t is mutable and recorded as a property of the corresponding Animal Source Record.
Some properties of an Animal Source Record allow single or multiple selections. For instance, an animal’s health status is recorded as one category (e.g., “live healthy” or “live sick”), whereas multiple anomalies may be reported for the same individual at time t (e.g., wounds, hair loss, diarrhea).
An Animal Source Record can be categorized as dead, diseased, injured, or healthy at time t. A live, healthy Animal Source Record can still be part of an Event, depending on the Event’s definition (e.g., a healthy animal of species X observed near dead animals of species X, Y, and Z, or live animals captured for Specimen collection).
An Animal Source Record can be used directly for a Diagnostic (e.g., performing X-rays on a live animal at time t). The data model supports Diagnostics applied to the individual itself, rather than only to Specimens collected from the animal.
Live Animal Sources cannot be removed from the field (ex-situ). Only Carcasses of Animal Sources can be transported and stored in a facility.
An individual of species X recorded as part of a Group Source Record cannot also be included as an Animal Source Record, and vice versa. If an Event contains both a Group Source Record of species X and an Animal Source Record of species X, then the total number of animals of species X at the Event is the sum of the individuals recorded in the Group Source Record and the single individual recorded as an Animal Source Record.
For example, consider a herd of 20 cows illegally raised in a protected area, where their health is assessed at time t. If two cows are sampled individually, they are recorded as two separate Animal Source Records (one per sampled cow). The remaining 18 cows are documented as one Group Source Record, where the total number of individuals is categorized by sex, age, and health status. Thus, the total number of cows at the Event is 20 (18 from the Group Source Record + 2 from the two Animal Source Records). To maintain the herd identity, a cluster can be used (see Complexities). If the Group Source Record initially included all 20 cows and two additional Animal Source Records were created for the sampled individuals, the total number of cows would incorrectly appear as 22, which must be avoided.
Vaccination
The data model allows the inclusion of vaccinations administered to an Animal Source at the time of capture or immobilization t. Multiple vaccination records can be added to an Animal Source Record as needed, ensuring that vaccination history accumulates for the corresponding Animal Source.
Carcass
Each Carcass recorded in the database originates from a single Animal Source and is collected at a specific time t. Therefore, a Carcass is always linked to one, and only one, Animal Source Record.
Key attributes of a Carcass include:
- Decomposition Condition
- Storage
- Storage During Transport
- Owner
- Availability
Carcasses may undergo multiple storage changes over time, including shipments between storage facilities. Any storage updates must occur only after the corresponding exportation is finalized, ensuring the Carcass is properly stored at its destination.
Carcasses do not provide Specimens directly; rather, Specimens are collected through the associated dead Animal Source Record. Similarly, a Carcass is not directly used for Diagnostics: any diagnostic procedures that involve a full carcass, such as X-rays, are applied to the corresponding dead Animal Source Record and documented accordingly.
Necropsy
A Necropsy is performed on a specific Animal Source Carcass. It can take place either when the dead animal is found, as a field necropsy (without Carcass collection), or after the Carcass has been collected.
Key attributes of a Necropsy include:
- Necropsy Identifier
- Necropsy Cross Reference Identifier
- Necropsy Date
- Findings per system
- Availability
Necropsies can be classified as primary or secondary. A primary necropsy typically begins with an intact Carcass that has not been previously examined. In contrast, a secondary necropsy is usually performed by a veterinary pathologist, either by reviewing photographs from the primary Necropsy or by conducting the Necropsy again.
A Necropsy does not provide Specimens directly; rather, Specimens are provided through the dead Animal Source Carcass.
Environmental Source
An Environmental Source is a unit in space from which Specimens that cannot be associated with a Group Source, an Animal Source, or an Arthropod Source can be collected (e.g., the site where feces of unknown source are found). Properties of an Environmental Source include the Environmental Source ID and the Cross Reference ID, among others (see Data Dictionary).
Key attributes of an Environmental Source include:
- Environmental Source ID
- Environmental Source Cross Reference ID
Environmental Source Record
An Environmental Source Record represents the biotic or abiotic material collected from an Environmental Source at time t using a specific Collection method. For example, water sampled from a pond (Event) at site X (Environmental Source) using a particular device (Collection) at time t constitutes an Environmental Source Record.
Key attributes of an Environmental Source Record include:
- Environmental Source Record ID
- Record Number
- Type of Tissue
- Quantity
- Quantity Unit
A key distinction between Environmental Sources and Group or Animal Sources is that multiple Records can be collected from the same Environmental Source within a single Event (e.g., multiple Collections from the same Environmental Source at time t). In contrast, an Event can include only one Record per Group or Animal Source. In other words, an Event with a single Environmental Source can yield multiple Records (one per Collection), whereas multiple bats captured in a mist net (multiple Sources) each contribute only one Record per Event (see figure at the end of this section).
Another key distinction is that ‘species’ is a property of the Source for Group and Animal Sources, whereas for Environmental Sources, it is a property of the Record. For example, feces found in the field may be identified at the taxonomic level of ‘mammal.’ In this case, the species property of the Environmental Source Record can be completed accordingly.
If a tissue Collection attempt from an Environmental Source at time t fails, the Collection itself is still recorded, but no Environmental Source Record is created. Conversely, an Environmental Source Record cannot exist without an associated successful Collection.
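The relationship between Collections and Environmental Source Records can be sketched as follows. This is a minimal illustration, not the data model's implementation: the class and field names, the identifiers, and the `successful` flag are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Collection:
    collection_id: str  # hypothetical identifier
    method: str
    successful: bool    # assumed flag: did the attempt yield material?


@dataclass
class EnvironmentalSourceRecord:
    record_id: str
    source_id: str
    collection: Collection
    tissue_type: str


def record_from_collection(
    record_id: str, source_id: str, collection: Collection, tissue_type: str
) -> Optional[EnvironmentalSourceRecord]:
    """Create a Record only when the Collection succeeded."""
    if not collection.successful:
        return None  # the Collection stays on file, but no Record is created
    return EnvironmentalSourceRecord(record_id, source_id, collection, tissue_type)


# Three Collections at the same Environmental Source during one Event.
collections = [
    Collection("col-1", "bottle", successful=True),
    Collection("col-2", "bottle", successful=False),
    Collection("col-3", "filter", successful=True),
]

records = []
for number, collection in enumerate(collections, start=1):
    record = record_from_collection(f"rec-{number}", "pond-site-X", collection, "water")
    if record is not None:
        records.append(record)
```

All three Collections remain documented, while only the two successful ones produce a Record for the same Source and Event.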
Arthropod Source
An Arthropod Source is a unit in space where arthropods can be taken from (e.g., a household in the forest where traps can be set). Properties of an Arthropod Source include the Source ID, the Cross Reference ID, the Cross Reference ID Origin, among others (see Data Dictionary).
Key attributes of an Arthropod Source include:
- Arthropod Source ID
- Arthropod Cross Reference ID
If the focus of the Surveillance Activity is on individual arthropods (e.g., butterflies with wing deformities or parasitic infestations), they should be considered Animal Sources.
Arthropods found on Animal Sources (such as attached ticks, lice, fleas, or mites) are considered Specimens collected from the Animal Source rather than independent Arthropod Sources.
Arthropod Source Record
An Arthropod Source Record represents the arthropods collected from an Arthropod Source at time t using a specific Collection method. For example, mosquitoes collected at time t (Arthropod Source Record) from a household in the forest (Arthropod Source) using a CO₂ trap (Collection). This means that mosquitoes obtained using CO₂ traps (Collection) at an Arthropod Source at time t constitute one Arthropod Source Record, while mosquitoes collected using BG traps at the same Event constitute a separate Arthropod Source Record.
Key attributes of an Arthropod Source Record include:
- Arthropod Source Record Number
- Arthropod Species
- Number by age, sex, and condition (females only)
As with Environmental Sources, a key distinction between Arthropod Sources and Group or Animal Sources is that an Event can contain multiple Records from a single Arthropod Source (many Collections at an Arthropod Source at the same time and place, one Record per Collection), whereas an Event can only contain a single Record per Group or Animal Source. In contrast, if multiple bats are captured in a mist net at time t (many Source Records), they each contribute only one Record per Event (see figure at the end of the section).
Another key difference is that “species” is a property of the Source for Group and Animal Sources, while for Arthropod Sources, several species can be part of the Record and they are assigned at the Source Record level. For example, mosquitoes collected using CO₂ traps (Collection) at a specific site (Source) will be identified and counted by species after Collection, forming an Arthropod Source Record.
If an arthropod Collection attempt at time t fails, the Collection itself is recorded, but no Arthropod Source Record is created. An Arthropod Source Record cannot be empty.
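As a rough sketch of how Arthropod Source Records could be assembled, the snippet below groups identified mosquitoes by Collection and counts them by species. The trap names, species, and dictionary layout are illustrative assumptions only; a Collection that caught nothing simply produces no Record.

```python
from collections import Counter, defaultdict

# Hypothetical catches at one Arthropod Source during one Event:
# (Collection method, species identified after Collection)
catches = [
    ("CO2 trap", "Aedes aegypti"),
    ("CO2 trap", "Culex pipiens"),
    ("CO2 trap", "Aedes aegypti"),
    ("BG trap", "Aedes aegypti"),
]

# Every Collection attempt is documented, including the one that caught nothing.
collections_attempted = ["CO2 trap", "BG trap", "light trap"]


def build_arthropod_records(catches):
    """One Record per Collection, holding counts by species; never empty."""
    records = defaultdict(Counter)
    for method, species in catches:
        records[method][species] += 1
    return dict(records)


arthropod_records = build_arthropod_records(catches)
```

Here the CO₂ trap and the BG trap each yield their own Record, several species can sit in the same Record, and the failed light trap Collection is recorded without a Record.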
Specimen
Generalities
In the data model, Specimens refer to tissues or materials collected to conduct Diagnostics (see Diagnostics).
Specimens originate from the following:
- Sampled Source Records (e.g., an oral swab from an individual animal; the Specimens within the green boxes in the figure below). These Specimens may consist of either a single type of tissue (e.g., blood) or multiple tissue types (e.g., blood and saliva) from the same Source Record.
- Diagnostic Products created by running a Diagnostic on a Specimen, which can be used in further Diagnostics (e.g., cDNA created from RNA in a sample; the Specimens within the pink boxes in the figure below).
- Other Specimens (Pooled Specimens). For example, different Specimens from the same or multiple Source Records are mixed, possibly together with Diagnostic Products (grey boxes in the figure below).
Key attributes of a Specimen include:
- Specimen ID
- Tissue Type
- Specimen Original Amount
- Specimen Current Quantity Stored
- Origin
- Ownership
Specimens are stored, and their storage may change multiple times over time as they are transferred within and between facilities or because the amount of Specimen changes as it is used in a Diagnostic. Similarly, Specimens can be exported multiple times. Any storage changes related to an exportation must occur only after the exportation is completed, ensuring the Specimen is properly stored at the destination facility.
Specimens without any amount of tissue left remain in the data model so the last storage, exportation, and use in Diagnostics can be traced.
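The storage, exportation, and quantity rules above can be sketched as a small class. This is a simplified illustration under stated assumptions: the method names, the `export_in_progress` flag, and the storage location labels are hypothetical and do not come from the Data Dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Specimen:
    specimen_id: str
    tissue_type: str
    original_amount: float
    current_amount: float
    storage_history: list = field(default_factory=list)
    export_in_progress: bool = False  # assumed flag for an open exportation

    def use_in_diagnostic(self, amount: float) -> None:
        if amount > self.current_amount:
            raise ValueError("not enough Specimen left")
        self.current_amount -= amount

    def store(self, location: str) -> None:
        # Storage may only change once any open exportation is completed.
        if self.export_in_progress:
            raise RuntimeError("complete the exportation before updating storage")
        self.storage_history.append(location)

    def start_export(self) -> None:
        self.export_in_progress = True

    def complete_export(self, destination: str) -> None:
        self.export_in_progress = False
        self.store(destination)


blood = Specimen("sp-001", "blood", original_amount=2.0, current_amount=2.0)
blood.store("field freezer")
blood.start_export()
blood.complete_export("reference laboratory freezer")
blood.use_in_diagnostic(1.5)
blood.use_in_diagnostic(0.5)  # the Specimen is now used up
```

The exhausted Specimen is not deleted: its last storage location, its exportation, and its use in Diagnostics remain traceable through the object.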
Specimens of Group Source Records
An example of a Group Source Specimen is feces collected from the bottom of a cage holding animals of the same species when it is unknown which of the animals dropped the feces.
A Group Source used directly for a Diagnostic (e.g., assessment of fat in carcasses of animals belonging to a specific herd) is not a Specimen (see Sources And Source Records).
Group Source Specimens cannot be OBTAINED after the last date the Group Source was observed (the last Record of the Group Source). For example, it is possible to collect feces from a cage that restrains a Group Source longitudinally and also from the same cage after the animals of the Group Source were moved. In this case, the Specimen from the empty cage does not belong to the Group Source but to an Environmental Source. The sampling of the feces from the empty cage belongs to a different Event. To keep the connection between the corresponding Specimens, it is possible to cluster the Group Source Records and the Environmental Source Record (see Complexities).
Specimens of Animal Source Records
An example of an Animal Source Specimen is 2 ml of blood taken from a lion.
An Animal Source used directly for a Diagnostic (e.g., ultrasound in an animal) is not a Specimen (see Sources And Source Records).
Animal Source Specimens cannot be OBTAINED after the last date the Animal Source was observed (the last Record of the Animal Source). For example, it is possible to collect feces from a cage that restrained an Animal Source longitudinally and also from the same cage after the animal was moved. In this case, the Specimen from the empty cage does not belong to the Animal Source but to an Environmental Source. The sampling of the feces from the empty cage belongs to a different Event. To keep the connection between the corresponding Specimens, it is possible to cluster the Animal Source Records and the Environmental Source Record (see Complexities).
The data model DOES accept Animal Source Specimens that are CREATED after the last date the Animal Source was documented (the last Record of the Animal Source). New Specimens can be generated during a Necropsy or from a stored Carcass. In this case, the date of Specimen creation is not necessarily the date when the Animal Source was found, when the Carcass was collected, or when the animal died, but after the storage of the Carcass or the date of the Necropsy (primary or secondary). It is possible to track if a Specimen was collected in the field (from the animal, carcass, during a field necropsy, or the ground near the animal), or in a facility from the carcass or during a primary or secondary Necropsy based on the information entered for Specimens.
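The difference between Specimens OBTAINED from the living or found animal and Specimens CREATED later from a Carcass or Necropsy can be expressed as a simple date check. The sketch below is an assumption about how such a rule might be encoded; the origin labels and the function name are hypothetical.

```python
from datetime import date

# Hypothetical origin labels for Specimens that may be created after the
# last Record of the Animal Source.
ORIGINS_ALLOWED_AFTER_LAST_RECORD = {"stored carcass", "necropsy"}


def specimen_date_is_accepted(specimen_date: date, last_record_date: date, origin: str) -> bool:
    """Return True when the Specimen date is compatible with its origin."""
    if origin in ORIGINS_ALLOWED_AFTER_LAST_RECORD:
        # Created from a Carcass or during a Necropsy: later dates are fine.
        return True
    # Obtained directly from the animal: not after the last Record.
    return specimen_date <= last_record_date


last_record = date(2024, 3, 10)

blood_on_capture = specimen_date_is_accepted(date(2024, 3, 10), last_record, "animal")
feces_from_empty_cage = specimen_date_is_accepted(date(2024, 3, 15), last_record, "animal")
liver_at_necropsy = specimen_date_is_accepted(date(2024, 4, 2), last_record, "necropsy")
```

In this sketch the feces collected from the empty cage are rejected as an Animal Source Specimen, which matches the text: they belong to an Environmental Source in a different Event.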
Specimens from Environmental Source Records
An example of an Environmental Source Specimen is 10 grams of soil collected from a specific site (Environmental Source). Another useful example is feces of unknown origin found in the field at time t (Source Record) at a specific site (Source).
In this second example it is possible to: i) collect the full feces or ii) take swabs from them. In the former case, the feces are both the Source Record and the Specimen (the Specimen can be stored; the Source Record cannot). In the latter, the feces are just the Source Record, whilst the swabs are the Specimens.
In the case of Environmental Source Records, the type of tissue of a Specimen is determined by the type of tissue of its Source Record. For example, water collected at time t (Environmental Source Record) from a pond (Environmental Source) can only yield tissue of type “water”. If a Specimen contains tissue from a single Environmental Source Record, then the type of tissue of the Specimen must be the same as the type of tissue of the Environmental Source Record.
The data model DOES accept Environmental Source Specimens that were CREATED after the last date the Environmental Source was visited (the last Record of the Environmental Source). New Specimens can be generated from a contained Environmental Source Record (for example, a bottle with sediment that is divided in Specimens later). In this case, the date of Specimen creation is not necessarily the date when the Environmental Source Record was obtained but later.
Specimens from Arthropod Source Records
An example of an Arthropod Source Specimen is a set of mosquitoes of the same species coming from the same Collection. These are essentially sub-groups of the mosquitoes present in the corresponding Source Record. In vector-borne disease surveillance, this type of Specimen is usually called a “pool”.
In the case of Arthropod Specimens, their type of tissue is determined by the Arthropod Source Record. If a Specimen contains tissue from a single Arthropod Source Record, then the type of tissue of the Specimen is “arthropod” by default. In this case, the link between the Specimen and its origin allows tracking the species of arthropod involved.
The data model DOES accept Arthropod Source Specimens that were CREATED after the last date the Arthropod Source was visited (the last Record of the Arthropod Source). New Specimens can be generated from an Arthropod Source Record (for example, taking a new sub-set of mosquitoes from the Source Record and creating a new Specimen with mosquitoes of the same species). In this case, the date of Specimen creation is not necessarily the date when the Arthropod Source Record was obtained but later.
Specimens with a mix of tissues coming from a unique or several Source Records
For example, a bat is swabbed in its oral cavity and in its rectum, and the swabs are then placed together in a tube and considered a single Specimen. The result is a single Specimen with a single Source Record as origin and two types of tissue: “rectal swab” and “oral swab”.
It is also possible to draw blood from two bats and mix it. The data model can accommodate this case because Specimens can have multiple Source Record origins. In the example, both bled bats (Source Records) are the origin of a single Specimen with “blood” tissue consisting of the mixed blood.
Similarly, it is possible to generate a Specimen by mixing tissues of different types from different Source Records. For example, the blood (tissue) of a bat captured at time t (Animal Source Record) can be mixed with the feces (tissue) collected from the bottom of the roost of the same bat at time t (Group Source Record). In this case, the mixed tissue (blood from an Animal Source and feces from a Group Source) can be included in the data model as a single Specimen with two Source Records as origin and two types of tissue.
The key here is that Specimens are generated by mixing tissue coming from different Source Records, which is different than mixing Specimens. In the data model, mixing Specimens means mixing data Units already documented (see next).
Specimens from Diagnostic Products
Products generated by a Diagnostic method (see Diagnostic Products as Specimens below) can be stored and used as Specimens in further Diagnostics (e.g., cDNA created as part of Diagnostic A is used in a new RT-PCR, Diagnostic B). In this case, the origin of these new “Specimens” is a specific Diagnostic and not Source Records, and their type is a diagnostic product such as cDNA. The remaining properties of a Specimen from a Diagnostic Product are the same as for Specimens from Source Records (see Data Dictionary).
Diagnostic Products must be added to the data model as a Specimen, so this Specimen can be pooled with other Specimens.
Pooled Specimens
Specimens can be created by mixing other Specimens of any origin. For example, mixing a Specimen from a Group Source, a Specimen from an Animal Source, a Specimen from an Environmental Source, and a Specimen from an Arthropod Source (see below). The origin of Pooled Specimens is tracked in the data model. In the example, the origin of the new Specimen is four Specimens, and the new Specimen has potentially four types of tissue. The remaining properties are the same as for Specimens from Source Records (see Data Dictionary).
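One way to picture origin tracking for Pooled Specimens is a small tree in which each Specimen points either to its Source Records or to the Specimens it was pooled from. The sketch below is illustrative only: the class, the `pool` and `trace_source_records` helpers, and all identifiers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Specimen:
    specimen_id: str
    tissue_types: set
    source_record_ids: set = field(default_factory=set)   # direct Source Record origins
    parent_specimens: list = field(default_factory=list)  # origins when pooled


def pool(specimen_id: str, parents: list) -> Specimen:
    """Create a Pooled Specimen whose origin is the list of parent Specimens."""
    tissues = set().union(*(p.tissue_types for p in parents))
    return Specimen(specimen_id, tissues, parent_specimens=list(parents))


def trace_source_records(specimen: Specimen) -> set:
    """Walk back through pooled origins to the underlying Source Records."""
    found = set(specimen.source_record_ids)
    for parent in specimen.parent_specimens:
        found |= trace_source_records(parent)
    return found


feces = Specimen("sp-1", {"feces"}, {"group-rec-1"})
blood = Specimen("sp-2", {"blood"}, {"animal-rec-1"})
soil = Specimen("sp-3", {"soil"}, {"env-rec-1"})
mosquitoes = Specimen("sp-4", {"arthropod"}, {"arthropod-rec-1"})

pooled = pool("sp-5", [feces, blood, soil, mosquitoes])
origins = trace_source_records(pooled)
```

The Pooled Specimen keeps four parent Specimens as its origin, carries up to four tissue types, and can still be traced back to every contributing Source Record.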
Specimens in Containers
In situations where space or materials are limited, multiple Specimens may be stored in a single container. This approach is not ideal because it can lead to cross-contamination and makes Specimen tracing more complex. However, the data model has Container Identifier as a property of Specimens (see Data Dictionary). Specific properties of each individual Specimen within the container, such as type, quantity, etc., should allow their visual identification within the container.
Diagnostics
Diagnostics in the data model encompass various techniques used to identify hazards (biological, chemical, physical) or physiological problems in either Records of Animal and Group Sources or Specimens obtained from Source Records of any type. Diagnostics conducted directly on Group and Animal Sources can include general body condition (Group or Animal Source), the width of body fat (Group or Animal Source), or an ultrasound (Animal Source). Diagnostics can range from advanced techniques such as metagenomics to basic observations of body condition. Whatever Diagnostic methods are used, they must be reported in the metadata of the Surveillance Activity.
In the data model, Diagnostics can be conducted in a Laboratory or similar (see Laboratory below) but it is also possible to include field-based assays. Diagnostics can also be conducted by an individual (ranger, hunter, researcher, biologist, etc.) when they involve external or simple observations such as the body condition of an animal to assess a nutrition related hazard.
In the data model, each Diagnostic is considered to be designed for a specific targeted hazard, such as a viral family or a particular virus species; therefore, each Diagnostic-Hazard combination is a Diagnostic method.
Diagnostic properties include the type, method, and result. The type of Diagnostic refers to a general category of diagnostic techniques, such as histopathology, serology, imaging, molecular, clinical. Within each type, there are specific methods available, such as biopsy, ELISA assays, X-rays, PCR, or the width of fat in a herd, respectively. In the data model, each Diagnostic has a unit of measurement that can either be qualitative or quantitative. The result of the diagnostic is reported as a value of those units. For example, an agglutination test can report the minimum dilution that causes observable agglutination. A PCR can be reported as presence of bands compatible with the targeted genome sequence or genetic sequence compatible with targeted organism if sequencing followed the genetic sequence amplification. The interpretation of a Diagnostic (positive, negative, undetermined) corresponds to an “Interpretation” of the Diagnostic (see Diagnostic Interpretation). Other properties of a Diagnostic include the Diagnostic ID, Diagnostic Code, Diagnostic Cross Reference ID, Diagnostic Cross Reference ID Origin, among others (see Data Dictionary).
Diagnostic Products as Specimens
In the data model, Diagnostics can produce “Diagnostic Products” that are then considered Specimens. Diagnostic Products as Specimens can be traced with respect to their storage, the quantities available, and any exportation process (see Storage and Export below). Furthermore, Specimens from Diagnostic Products can be used in other Diagnostics (see Specimens above), for example, cDNA used in a new PCR assay. In short, the data model accepts a new Specimen generated from a Diagnostic Product with the corresponding type, and the origin of this Specimen is a Diagnostic instead of one or multiple Source Records or other Specimens (see Specimens above).
Complexities
Storage
In the data model, Specimens from all Sources (including Diagnostic Products) and Carcasses of Animal Sources can be stored, their storage can be tracked, and changes in storage can be traced.
Export
In the data model, Specimens from all Sources (including Diagnostic Products) and Carcasses of Animal Sources can be exported and exportations can be tracked.
Laboratories
The data model includes Laboratory entities. Laboratories can conduct Diagnostics to test for hazards in Specimens or in full Source Records. Field-based Diagnostics or simple Diagnostics conducted by people are also possible in the data model (see Diagnostics). Laboratory properties include address, manager, name, and Laboratory ID, among others (see Data Dictionary). It is also possible to store data regarding Laboratory capabilities in terms of diagnostic tests and storage, and their certifications (biosafety levels, etc.).
Interpretation
Interpretation of a Hazard’s detection in a Diagnostic
In the data model, Interpretation of a Diagnostic is the conclusion regarding the detection of a hazard using a Diagnostic targeting that specific hazard, based on a case definition for a positive, negative, or undetermined result provided in the Surveillance Activity metadata. Each Diagnostic receives one Interpretation only.
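To make the link between a Diagnostic result and its Interpretation concrete, the sketch below applies a case definition to a quantitative result. Everything specific here is hypothetical: the thresholds, the hazard name, the unit, and the field names are invented for illustration, since the actual case definition is whatever the Surveillance Activity metadata states.

```python
from dataclasses import dataclass


@dataclass
class Diagnostic:
    diagnostic_id: str
    hazard: str
    method: str
    unit: str
    result: float


# Hypothetical case definition, as it might be stated in the
# Surveillance Activity metadata for one Diagnostic method.
CASE_DEFINITION = {"positive_at_or_above": 100.0, "negative_below": 50.0}


def interpret(diagnostic: Diagnostic, case_definition: dict = CASE_DEFINITION) -> str:
    """Return the single Interpretation of a Diagnostic under the case definition."""
    if diagnostic.result >= case_definition["positive_at_or_above"]:
        return "positive"
    if diagnostic.result < case_definition["negative_below"]:
        return "negative"
    return "undetermined"


tests = [
    Diagnostic("dx-1", "hazard-X", "agglutination test", "reciprocal titer", 200.0),
    Diagnostic("dx-2", "hazard-X", "agglutination test", "reciprocal titer", 25.0),
    Diagnostic("dx-3", "hazard-X", "agglutination test", "reciprocal titer", 50.0),
]
interpretations = {d.diagnostic_id: interpret(d) for d in tests}
```

Each Diagnostic maps to exactly one Interpretation, and the result itself stays stored separately in its own unit.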
Interpretation of a Hazard’s presence in a Specimen
In the data model, Interpretation of a Hazard in a Specimen is the conclusion regarding the presence or absence of a specific hazard in a Specimen, based on a case definition for a positive, negative, or undetermined Specimen provided in the Surveillance Activity metadata. The interpretation follows the results from the Diagnostic(s) conducted using the Specimen and the corresponding Diagnostic(s) Interpretation(s). Specimens can receive multiple Interpretations if they are used to conduct several Diagnostics to assess several hazards or if the same hazard is assessed more than once in different diagnostics.
Interpretation of a Hazard’s presence in a Source Record
In the data model, Interpretation of a Hazard in a Source Record is the conclusion regarding the presence or absence of a specific hazard in a Source Record, based on a case definition for a positive, negative, or undetermined Source Record provided in the Surveillance Activity metadata. The interpretation follows the results of the Diagnostic(s) conducted using the Source Record, the Source Record Specimen(s), the corresponding Diagnostic(s), anomalies in the Source Record at an Event, and Necropsy findings.
Source Records can receive multiple Interpretations if multiple hazards are assessed in them or the same hazard is assessed more than once. Sources can have many Interpretations through several Source Records if multiple hazards are assessed in them at time t (Source Record), if the same hazard is assessed more than once at time t (Source Record), or if different or the same hazard are assessed over time (several Source Records of the same Source over time).
Clustering of Locations, Events, and Source Records
Clusters in Surveillance Activities
The units Location, Event, and Source Record could be enough to record the data structure of a specific Surveillance Activity. For example, rangers collecting information during patrols (Field Visits) in Protected Areas (Locations) at specific points (Events) where dead animals (Source Records) are found at any given time. But another Surveillance Activity could have the following data structure:
- Visit (Field Visit)
  - Protected area (Spatial cluster level 1)
    - Zones within protected area (Location)
      - Grid cells within each zone (Spatial cluster level 2)
        - Capture site (Event)
          - Mist nets at the capture site (Collection)
            - Bats captured (Source Records)
Another example is surveillance of wild animals in live markets. One of the potential options to structure these data is:
- Visit (Field Visit)
  - City (Spatial cluster level 1)
    - Neighborhood (Spatial cluster level 2)
      - Market (Location)
        - Vendor (Spatial cluster level 3)
          - Stalls (Spatial cluster level 4)
            - Cage (Event)
              - Animals (Source Records)
                - Rectal Swab (Specimens)
The examples above show that the units Location, Event, and Source Record might not be enough to accommodate the data of all Surveillance Activities, and that other units (“Clusters”) may be needed.
A second layer of complexity is the potential need for non-nested Clusters. The last example above corresponds to a series of nested Clusters where the stalls are within vendors and neighborhoods are within cities. But it is also possible to need non-nested spatial Clusters, for example, “Zip Code”. Zip Code A can include portions of cities 1 and 2, and Zip Code B can also include portions of cities 1, 2, and 3.
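The difference between nested and non-nested Clusters can be sketched with two small lookup structures. This is only an illustration under assumed names: a nested Cluster has exactly one parent, so a parent map is enough, whereas a non-nested Cluster such as a Zip Code needs a many-to-many membership table.

```python
# Nested units from the live market example: each unit has exactly one parent.
# All identifiers are hypothetical.
nested_parent = {
    "stall-1": "vendor-1",
    "vendor-1": "market-1",        # the Market is the Location
    "market-1": "neighborhood-1",
    "neighborhood-1": "city-1",
}


def ancestors(unit: str) -> list:
    """Follow the single-parent chain upwards."""
    chain = []
    while unit in nested_parent:
        unit = nested_parent[unit]
        chain.append(unit)
    return chain


# Non-nested Clusters: a Zip Code can span several cities, and a city can
# fall into several Zip Codes.
zip_membership = {
    "zip-A": {"city-1", "city-2"},
    "zip-B": {"city-1", "city-2", "city-3"},
}


def zips_for(city: str) -> list:
    return sorted(z for z, cities in zip_membership.items() if city in cities)


stall_chain = ancestors("stall-1")
city_1_zips = zips_for("city-1")
city_3_zips = zips_for("city-3")
```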
And a third layer of complexity is the potential need for spatial and temporal clusters. For example, it is possible that the data structure of the live market Surveillance Activity presented above has to be categorized in decades, year, season, season-year, etc. The following example adds the categorization of the data by season, no matter the markets visited or how many years the Surveillance Activity lasts:
- Season (Temporal cluster 1)
  - Visit (Field Visit)
    - City (Spatial cluster level 1)
      - Neighborhood (Spatial cluster level 2)
        - Market (Location)
          - Vendor (Spatial cluster level 3)
            - Stalls (Spatial cluster level 4)
              - Cage (Event)
                - Animals (Source Records)
                  - Rectal Swab (Specimens)
The spatial clusters can be nested, non-nested, or a combination of both. Similarly, the temporal cluster can be nested, non-nested, or a combination of both.
Finally, the number of categories within each Cluster can also differ across Surveillance Activities. For example, two Surveillance Activities may both include grid cells as one of their Clusters, but one can include grid cells A to R whilst the other includes grid cells A to W (more categories). In summary:
- Cluster units can be needed
- The Clusters can differ across Surveillance Activities
- What these Clusters group can differ across Surveillance Activities
- The number of categories can differ among Clusters, within and between Surveillance Activities
- Nested Clusters, non-nested Clusters, or both can be needed
- Spatial Clusters, temporal Clusters, or both can be needed
Clusters in the Data Model
To accommodate these needs in data structure, the data model allows the inclusion of Clusters between the Source Record and Event, Event and Location, and Location and Field Visit. The data model also allows the inclusion of an undetermined number of nested and non-nested Clusters at each of these levels, and the inclusion of spatial and temporal Clusters. Collections are always nested under, and only under, an Event.
Unavoidably, the number of Clusters, what Clusters represent, what they cluster, the number of categories per Cluster, and the data to be collected from each of these extra units will vary among Surveillance Activities. Therefore, Cluster properties must be reported in the Surveillance Activity metadata. In the data model, the only default properties for each Cluster are the identifier, the cross identifier, the origin of the cross identifier, and a description. Other potential properties must be documented in a separate file (e.g., an Excel sheet) with common identifiers to allow joining the Clusters with the corresponding data.
A Cluster unit can group Source Records of different types. An interesting case study is the collection of Specimens from reindeer carcasses found dead and from the soil underneath each carcass. In this Surveillance Activity, an Event is a site where dead animals (at least one) are found. So, if two carcasses are found at the same site, the Event contains two Animal Sources and also two Environmental Sources providing soil Specimens. The researchers want to maintain the connection between the soil Specimen collected below each Animal Source and the Specimens from that Animal Source. In this case, a Cluster unit can be used to group the corresponding soil Specimen and animal Specimen into two pairs while keeping the four Specimens under the same Event.
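The reindeer case can be sketched with a Cluster identifier attached to each Specimen. The dictionary layout, the `cluster_id` key, and all identifiers below are hypothetical and only illustrate the pairing idea.

```python
from collections import defaultdict

# Four Specimens collected at the same Event (one site with two carcasses).
specimens = [
    {"specimen_id": "sp-1", "source_type": "animal", "event_id": "site-1", "cluster_id": "pair-1"},
    {"specimen_id": "sp-2", "source_type": "environmental", "event_id": "site-1", "cluster_id": "pair-1"},
    {"specimen_id": "sp-3", "source_type": "animal", "event_id": "site-1", "cluster_id": "pair-2"},
    {"specimen_id": "sp-4", "source_type": "environmental", "event_id": "site-1", "cluster_id": "pair-2"},
]

# Group by Cluster to recover which soil Specimen lay under which carcass.
pairs = defaultdict(list)
for specimen in specimens:
    pairs[specimen["cluster_id"]].append(specimen["specimen_id"])

events = {specimen["event_id"] for specimen in specimens}
```

The Cluster keeps each carcass and soil pair together without splitting the Event: all four Specimens still share the same `event_id`.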
Surveillance Activities
Initially, it was mentioned that “Field Visits, Locations, Events, Sources, Source Records, and Diagnostics usually belong to a single Surveillance Activity. This is the Surveillance Activity that led to the Field Visits at different Locations to document Events, collect Sources and Specimens, perform Diagnostics for a specific hazard, and provide Diagnosis for Diagnostics, Specimens, and Sources”.
Furthermore, it was established that “a Surveillance Activity usually includes Field Visits, Locations, Events, Sources, Source Records, and Diagnostics”.
This section explains the exceptions to these two general statements and how the data model handles them.
Field Visits up to Specimen belong to two or more Surveillance Activities
This can happen when two or more Surveillance Activities use the exact same methodology except for the Diagnostics (they target different hazards but the Diagnostics are conducted using the same Specimens). For example, the same rodents are trapped and sampled but they are tested for coronaviruses as part of Surveillance Activity 1 and for Rickettsia sp. as part of Surveillance Activity 2. In this case the Field Visits, Locations, Events, Collections, Source Records, and Specimens belong to both Surveillance Activities. Instead, the Diagnostics and Diagnosis belong to one of them only.
Field Visits up to Source Records belong to two or more Surveillance Activities
This can happen when two or more Surveillance Activities overlap in their Field Visits up to Source Records but the Specimens up to Diagnostics are different. For example, the same rodents are trapped and sampled during the same Collection but Surveillance Activity 1 uses rectal swabs to test for coronaviruses whilst Surveillance Activity 2 uses blood to test for Rickettsia sp. In this case the Field Visits, Locations, Events, Collections, and Source Records belong to both Surveillance Activities. Instead, the Specimens, Diagnostics, and Diagnosis belong to one of them only.
Field Visits up to Collections belong to two or more Surveillance Activities
This can happen when two or more Surveillance Activities overlap in their Field Visits up to Collections but the Source Records up to Diagnostics are different. For example, bats are collected using the same mist nets; however, only Megabats are tested for Ebola as part of Surveillance Activity 1. Instead, only Microbats are tested for coronaviruses as part of Surveillance Activity 2. In this case the Field Visits, Locations, Events, and Collections belong to both Surveillance Activities. The Source Records, Specimens, Diagnostics, and Diagnosis belong to one of them only.
Field Visits up to Events belong to two or more Surveillance Activities
This can happen when two or more Surveillance Activities overlap in their Field Visits up to Events, but the Collections up to Diagnostics are different because they have different goals. For example, water is collected from Event A as part of Surveillance Activity 1 and mosquitoes are also collected from Event A as part of Surveillance Activity 2. In this case the Field Visits, Locations, and Events belong to both Surveillance Activities. Instead, the Collections, Source Records, Specimens, Diagnostics, and Diagnosis belong to one of them only.
Field Visits and Locations belong to two or more Surveillance Activities
This can happen when two or more Surveillance Activities overlap in their Field Visits and Locations but the Events up to Diagnostics are different. For example, Surveillance Activity 1 aims to capture bats using mist nets and Surveillance Activity 2 aims to capture rodents. The Field Visit and the Locations could be the same, but the bats can be captured at night and the rodents in the early morning. In this case the Field Visits and Locations belong to both Surveillance Activities. Instead, the Events, Collections, Source Records, Specimens, Diagnostics, and Diagnosis belong to one of them only.
Field Visits belong to two or more Surveillance Activities
This can happen when two or more Surveillance Activities overlap in their Field Visits but the Locations up to Diagnostics are different. A Field Visit can be shared because doing so facilitates logistics. For example, chimpanzee urine is collected in the forest as part of Surveillance Activity 1 and, on the way back, a visit is made to a Rehabilitation Center to sample animals as part of Surveillance Activity 2.
Surveillance Activity contains only Specimens and Diagnostics
This can happen when Surveillance Activity 2 uses Specimens collected as part of Surveillance Activity 1 to test them again for the same or a different hazard. For example, Specimens from bats collected ten years ago are used in a new study to test them for SARS-CoV-2. The Specimens belong to both Surveillance Activities. The Diagnostics are the only new entities generated as part of Surveillance Activity 2, and it is clear that Specimens from Surveillance Activity 1 were used in Surveillance Activity 2.
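A minimal sketch of this reuse, assuming hypothetical `Specimen` and `Diagnostic` classes and made-up identifiers (none of these names come from the data model itself): retesting adds the second Surveillance Activity to the existing Specimen and creates only a new Diagnostic.

```python
from dataclasses import dataclass, field


@dataclass
class Specimen:
    """Hypothetical Specimen carrying the Surveillance Activities it belongs to."""
    specimen_id: str
    activities: set = field(default_factory=set)


@dataclass
class Diagnostic:
    """Hypothetical Diagnostic tied to one Specimen and one Surveillance Activity."""
    diagnostic_id: str
    specimen: Specimen
    activity: str
    hazard: str


def retest(specimen, activity, diagnostic_id, hazard):
    """Reuse an existing Specimen in another activity; only the Diagnostic is new."""
    specimen.activities.add(activity)  # the Specimen now belongs to both activities
    return Diagnostic(diagnostic_id, specimen, activity, hazard)


# A bat Specimen collected under Surveillance Activity 1 ...
bat_specimen = Specimen("bat-specimen-17", {"SA1"})
# ... is tested for SARS-CoV-2 under Surveillance Activity 2.
new_test = retest(bat_specimen, "SA2", "diagnostic-501", "SARS-CoV-2")
```

Because the new Diagnostic keeps a reference to the original Specimen, the provenance (a Specimen from Surveillance Activity 1 used in Surveillance Activity 2) stays traceable.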
Surveillance Activity contains only Source Records, Specimens, Diagnostics, and Diagnosis
This can happen when Surveillance Activity 2 generates new Specimens from carcasses of Source Records collected as part of Surveillance Activity 1 and tests these new Specimens for a hazard. For example, carcasses of bats collected ten years ago will be used in a new study, Surveillance Activity 2, to get new Specimens and test them for SARS-CoV-2. The Source Records belong to both Surveillance Activities. The new Specimens and Diagnostics are the only new entities generated as part of Surveillance Activity 2, and it is clear that Source Records (the carcasses in this particular case) from Surveillance Activity 1 were used in Surveillance Activity 2.
Same Source in different Surveillance Activities
The health of the same animal could be assessed under different Surveillance Activities. For example, a ranger finds a sick animal on a specific date and assesses its health and the potential cause of disease as part of Surveillance Activity 1. Then, the animal is taken to a rehabilitation center where its health is assessed again at admission and samples are collected for testing as part of Surveillance Activity 2. Finally, the animal's health is assessed again at the rehabilitation center three months later, and samples are collected again to learn what pathogens are circulating within the facility, as part of Surveillance Activity 3.
In this case there are three of each of the following, all independent of each other: Projects, Surveillance Activities, Field Visits, Events, and Collections. There are two Locations. The Animal Source belongs to all three Surveillance Activities. The Source Records and Specimens each belong to a single Surveillance Activity, as do the Diagnostics completed in Surveillance Activities 2 and 3.
Therefore, Surveillance Activities can be interconnected through common units or the reuse of units generated by previous Surveillance Activities.
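As an illustration of the example above, the Surveillance Activities of a Source can be derived from its Source Records, each of which belongs to a single activity. The record layout and all identifiers below are assumptions made for this sketch only.

```python
from collections import defaultdict

# Hypothetical (source_id, source_record_id, surveillance_activity) triples.
# Each Source Record belongs to exactly one Surveillance Activity.
source_records = [
    ("animal-7", "record-ranger", "SA1"),
    ("animal-7", "record-admission", "SA2"),
    ("animal-7", "record-followup", "SA3"),
]

# Collect, per Source, every Surveillance Activity in which it was recorded.
activities_by_source = defaultdict(set)
for source_id, _record_id, activity in source_records:
    activities_by_source[source_id].add(activity)
```

Here the single Animal Source `animal-7` ends up linked to all three Surveillance Activities even though no Source Record is shared between them.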
Sources in Surveillance Activity but not Present in any Event
Studies that track marked individuals over time could succeed in capturing all individually identified animals, so that all these Sources are included as Source Records at least once during the study period. However, it is also possible that only a proportion of the individually identified Sources are captured or sighted during the Surveillance Activity, so that some Sources are not present as a Source Record in any Event. In this second case, it is still important to track the Sources that were part of the sampling frame and to censor them when they were not part of any Event.
For this reason, the data model can assign a Surveillance Activity to a Source without including any Field Visit, Location, Event, or Collection. When the study is completed, it will be possible to identify all the Sources involved in the study, including those that were never captured (Source 3 in the figure below).
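The bookkeeping described here amounts to a set difference between the Sources assigned to the Surveillance Activity and the Sources that appear in at least one Source Record. A minimal sketch, with hypothetical identifiers and a simplified record layout:

```python
# Sources assigned directly to the Surveillance Activity (the sampling frame).
sampling_frame = {"source-1", "source-2", "source-3"}

# Hypothetical Source Records as (event_id, source_id) pairs.
source_records = [
    ("event-1", "source-1"),
    ("event-1", "source-2"),
    ("event-2", "source-1"),
]

# Sources that were captured or sighted in at least one Event.
observed = {source_id for _event_id, source_id in source_records}

# Sources in the sampling frame that never appear in any Event.
never_observed = sampling_frame - observed
```

With these inputs, `never_observed` contains only `source-3`, the Source that was part of the study but was never captured.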
Outbreak Investigation
An outbreak can contain elements associated with two or more Surveillance Activities. For example, the outbreak could have been first detected by a citizen or by rangers patrolling a protected area. The Field Activity, Location, Event, Collection (if any), Source Records, and Specimens collected when the outbreak is discovered belong to the “Citizen Science” or “Ranger Patrol” Surveillance Activity. However, they also belong to the “Outbreak Investigation” Surveillance Activity, together with the Field Activities up to the Diagnostics associated exclusively with the Outbreak Investigation.
From Group Source to Animal Source
The data model can handle the conversion of an individual in a Group Source into an Animal Source. Animal Sources have a property to identify them as previous members of a Group Source.
The decision to convert an individual of a Group Source into an Animal Source could occur because: i) it was decided to start recording animals of a Group Source at the individual level, including their Specimens and Diagnostics; ii) field Necropsies are conducted on the Carcasses of animals that belong to a Group Source (Necropsy data in the data model are linked to individual animals only); iii) all or some Carcasses of a Group Source are collected and stored or exported (Carcass data in the data model are linked to individual animals only); or iv) Necropsies are conducted in a facility on the Carcasses of animals belonging to a Group Source (again, Necropsy data in the data model are linked to individual animals only). The data model accepts Diagnostics conducted directly on a Group Source and Diagnostics conducted using Group Source Specimens, such as assessing the overall body condition of the Group or testing feces taken from a cage occupied by two animals of the same species. In these cases, it is not necessary to convert individuals of a Group Source to Animal Sources.
In the first case, all or some of the animals of a Group Source could start being recorded at the individual level. If only some of the animals of a Group Source are recorded at the individual level, then it is possible to have an Event at time t with both Group Source Records and Animal Source Records, where the latter correspond to animals that belonged to the Group Source at time t’ and are considered Animal Sources at time t. From time t onward, the original Group Source does not include the animals that were converted to Animal Sources. Therefore, the count of individuals in the Group Source Record must not include the new Animal Sources. It is also possible that, starting at time t in a new Field Activity, all the animals of a Group Source are recorded as Animal Sources. In this case there are no more records of the Group Source from time t onward.
In the second case, a Necropsy conducted with the Carcass of an animal of a Group Source leads to the same process explained in the previous paragraph. The animal must be documented at the individual level because, in the data model, Necropsy data are tied to Animal Sources only.
Lastly, the collection of a Carcass of an animal of a Group Source and potential subsequent Necropsies also leads to the same process. The animal of a Group Source must be documented at the individual level because, in the data model, Carcass and Necropsy data are tied to Animal Sources only.
It is important to highlight two points. First, conversion from an individual of a Group Source to an Animal Source is complete and irreversible. This means that new Animal Source Records cannot be included in the count of animals in a Group Source, Animal Sources cannot be converted back to be part of the Group Source, and Specimens of the new Animal Sources are not Specimens of the original Group Source. Second, when it is decided that animals in a Group Source will be documented as Animal Sources, there is a change of methodology and, most likely, the new Animal Sources should receive a new, single Surveillance Activity. It is possible to keep a connection in the data model between these Surveillance Activities (the one with the Group Source only and the one with the Group Source and Animal Sources). Animal Sources also have a property that links them to their original Group Source. Alternatively, the original Surveillance Activity methodology may already have included an eventual change from recording animals as a Group Source to recording them as Animal Sources, in which case a single common Surveillance Activity is enough.
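The conversion rules can be sketched as follows. This is a simplified illustration rather than the data model's actual schema: the class names, the `original_group_id` field, and the single `count` attribute (in the data model the count lives on each Group Source Record) are assumptions. Irreversibility is modeled by making the Animal Source immutable and by offering no operation that returns an animal to the group.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroupSource:
    """Hypothetical Group Source; `count` is the number of individuals still
    recorded at the group level (a simplification for this sketch)."""
    group_id: str
    count: int


@dataclass(frozen=True)  # frozen: the link to the original group cannot be changed
class AnimalSource:
    """Hypothetical Animal Source with an optional link to its former Group Source."""
    animal_id: str
    original_group_id: Optional[str] = None


def convert_individual(group, animal_id):
    """Convert one individual of a Group Source into an Animal Source."""
    if group.count < 1:
        raise ValueError("no individuals left in the Group Source to convert")
    group.count -= 1  # the group count must exclude the new Animal Source
    return AnimalSource(animal_id, original_group_id=group.group_id)


# Two animals share a cage and are recorded as one Group Source.
cage = GroupSource("group-12", count=2)
# One of them is necropsied, so it must be recorded individually.
animal = convert_individual(cage, "animal-31")
```

After the call, the group count is 1 and the new Animal Source still points back to `group-12`, which mirrors the property that identifies Animal Sources as previous members of a Group Source.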
All Units in the Data Model and All Relationships
Considering the details explained in the Complexities section, a summary of the relationships in the data model is provided in the following figure. The orange Units and connections apply to Animal Sources only. The green lines apply to Group and Animal Source Records only.