Data-Trust: Trustworthiness Metadata for News Reports

Data-Trust is a metadata schema that allows news reports to make assertions about their trustworthiness.

The schema defines a number of attributes related to trustworthiness and proposes a method of implementation using HTML5's custom "data" attributes.

Data-Trust attributes

data-trust-interview-methods

Description: A comma-separated list of methods the reporter used to conduct interviews
Values:
  audio: An audio interview using a phone or similar technology
  video: A video interview using videoconferencing or similar technology
  written: A written interview using e-mail, instant messaging, or similar technology
  live: A face-to-face interview

data-trust-interview-translated

Description: Indicates whether a translator or translation service was used to conduct an interview
Values:
  true: A translator or translation service was used
  false: A translator or translation service was not used

data-trust-original-source

Description: Indicates whether the content was reported by the news outlet that is publishing it, or whether the news outlet is simply reprinting it from another source
Values:
  original: The content includes original reporting
  reprint: The content is being reprinted from another source
  URL: If a URL is present, it refers to the source from which the content is being reprinted

data-trust-sources

Description: A comma-separated list of the types of sources used to substantiate the information contained in the text
Values:
  press-release: A written press release
  document: A document referenced in the text, such as a court record or an internal memo
  interview: An interview conducted by the reporter
  news-reports: The information was reported by other news outlets
  observation: The reporter directly observed what is being described in the text
  anonymous: The information was obtained from an unnamed source
  bs: Self-explanatory

data-trust-reporter-notes

Description: A comma-separated list of methods by which the reporter recorded the information used in the text
Values:
  handwritten: The reporter took handwritten notes
  typewritten: The reporter took typewritten notes
  audio: The reporter used an audio recording device
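
Since every attribute above draws its values from a short fixed list, a consuming tool could check a value before acting on it. The following is a minimal sketch of such a check, not part of the schema itself. The names VOCABULARY, isHttpUrl, and validateTrustAttribute are hypothetical, and the sketch assumes that a URL in data-trust-original-source is an absolute http(s) URL that may appear as one item of a comma-separated value.

```javascript
// Allowed values, copied from the attribute definitions above.
const VOCABULARY = {
  "data-trust-interview-methods": ["audio", "video", "written", "live"],
  "data-trust-interview-translated": ["true", "false"],
  "data-trust-original-source": ["original", "reprint"],
  "data-trust-sources": [
    "press-release", "document", "interview", "news-reports",
    "observation", "anonymous", "bs",
  ],
  "data-trust-reporter-notes": ["handwritten", "typewritten", "audio"],
};

// Assumption: a source URL is an absolute http or https URL.
function isHttpUrl(text) {
  try {
    const protocol = new URL(text).protocol;
    return protocol === "http:" || protocol === "https:";
  } catch (error) {
    return false; // not parseable as an absolute URL
  }
}

// Hypothetical helper: split a raw attribute value on commas and report
// which items are not part of the vocabulary for that attribute.
function validateTrustAttribute(name, rawValue) {
  if (!Object.prototype.hasOwnProperty.call(VOCABULARY, name)) {
    return { recognized: false, values: [], invalid: [] };
  }
  const values = String(rawValue)
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
  const invalid = values.filter((item) => {
    if (VOCABULARY[name].includes(item)) return false;
    if (name === "data-trust-original-source" && isHttpUrl(item)) return false;
    return true;
  });
  return { recognized: true, values, invalid };
}

console.log(validateTrustAttribute("data-trust-reporter-notes", "handwritten,audio"));
console.log(validateTrustAttribute("data-trust-sources", "interview, rumor"));
```

In this sketch an attribute name outside the schema is reported as unrecognized rather than invalid, so a tool can ignore unrelated "data" attributes on the same page.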

Implementation

Data-Trust attributes can be added to an online news article using HTML5's custom "data" attributes.

Because the custom "data" attributes are backwards-compatible, they will not interfere with Web browsers that do not yet support the HTML5 standard.

Below is a concise example of how Data-Trust attributes can be added to the source code of a news report:

<html>
<body data-trust-reporter-notes="handwritten,audio" data-trust-interview-methods="live,audio">
<h1>Village Green fire ruled arson</h1>
<p data-trust-sources="press-release">
A fire Tuesday that leveled a home in Village Green was intentionally set, officials said today.
</p>
<p data-trust-interview-methods="audio" data-trust-interview-translated="false" data-trust-sources="interview">
"It appears that gasoline was poured inside the garage and then ignited," said Fire Marshal Mike Smith.
</p>
<p data-trust-sources="press-release">
Damage was estimated at $250,000.
</p>
</body>
</html>

The Data-Trust attributes are not visible to the end user, but browser extensions, search engine robots, and other software can read them. Such tools can then interpret, visualize, or filter the attributes to help an end user judge the trustworthiness of the news report.
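
As a rough illustration of how a script might read these attributes, the sketch below uses the standard DOM dataset property, which exposes data-trust-sources as dataset.trustSources. The helper names datasetKey and readTrustAttributes are hypothetical, and the quoteParagraph object is only a stand-in for the second paragraph of the example so that the sketch runs outside a browser.

```javascript
// The five attribute names defined by the schema.
const TRUST_ATTRIBUTES = [
  "data-trust-interview-methods",
  "data-trust-interview-translated",
  "data-trust-original-source",
  "data-trust-sources",
  "data-trust-reporter-notes",
];

// Convert "data-trust-sources" to the dataset key "trustSources":
// drop the "data-" prefix and camel-case each hyphenated part.
function datasetKey(attributeName) {
  return attributeName
    .replace(/^data-/, "")
    .replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}

// Hypothetical helper: collect the raw Data-Trust values set on one element.
// Works with any object that has a DOM-style dataset property.
function readTrustAttributes(element) {
  const found = {};
  for (const name of TRUST_ATTRIBUTES) {
    const raw = element.dataset[datasetKey(name)];
    if (raw !== undefined) {
      found[name] = raw;
    }
  }
  return found;
}

// Stand-in for the quoted paragraph in the example above.
const quoteParagraph = {
  dataset: {
    trustInterviewMethods: "audio",
    trustInterviewTranslated: "false",
    trustSources: "interview",
  },
};

console.log(readTrustAttributes(quoteParagraph));
```

In a browser, the same helper could be passed a real element, for example one returned by document.querySelector("p").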

Using the data

The following use cases explain how the Data-Trust attributes can be used to help readers determine the trustworthiness of a news report.

Browser extension

The popular Greasemonkey browser add-on in Mozilla Firefox and Google Chrome executes custom JavaScript that can enhance or extend a user's browsing experience.

In the case of Data-Trust attributes, a Greasemonkey user script can perform on-the-fly highlighting of text based on the attributes' presence or values.

For instance, the following Greasemonkey user script detects portions of the text that have been tagged with the data-trust-sources attribute and highlights them according to their values:

// This is a Greasemonkey user script. To install it, you need
// Greasemonkey 0.3 or later: http://greasemonkey.mozdev.org/
// Then restart Firefox and revisit this script.
//
// --------------------------------------------------------------------
//
// ==UserScript==
// @name          Data-Trust syntax highlighter for sources
// @namespace     *
// @include       *
// @require       http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js
// ==/UserScript==

// Highlight every element whose data-trust-sources attribute is exactly one
// of the values below. These exact-match selectors do not match an attribute
// that lists several comma-separated values.
$(document).ready(function() {
  $('*[data-trust-sources="press-release"]').css('background-color', '#FEFFAF'); // yellow
  $('*[data-trust-sources="document"]').css('background-color', '#FFAFAF'); // red
  $('*[data-trust-sources="interview"]').css('background-color', '#BBFFAF'); // green
  $('*[data-trust-sources="news-reports"]').css('background-color', '#AFEAFF'); // blue
  $('*[data-trust-sources="observation"]').css('background-color', '#DDAFFF'); // purple
});
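
The selectors in the script above compare the whole attribute value, so a paragraph tagged with a list such as "interview,document" would not be highlighted. The sketch below shows one possible way to handle lists without jQuery. The helper highlightColor is hypothetical, and coloring by the first recognized source type in the list is an assumption, not something the schema prescribes.

```javascript
// Colors taken from the user script above, keyed by source type.
const SOURCE_COLORS = {
  "press-release": "#FEFFAF", // yellow
  "document": "#FFAFAF",      // red
  "interview": "#BBFFAF",     // green
  "news-reports": "#AFEAFF",  // blue
  "observation": "#DDAFFF",   // purple
};

// Hypothetical helper: return the color of the first recognized source type
// in a comma-separated data-trust-sources value, or null if none matches.
function highlightColor(attributeValue) {
  const items = String(attributeValue || "")
    .split(",")
    .map((item) => item.trim().toLowerCase())
    .filter((item) => item.length > 0);
  for (const item of items) {
    if (Object.prototype.hasOwnProperty.call(SOURCE_COLORS, item)) {
      return SOURCE_COLORS[item];
    }
  }
  return null;
}

// Browser-only part: skipped when no DOM is available (for example in Node.js).
if (typeof document !== "undefined") {
  document.querySelectorAll("[data-trust-sources]").forEach((element) => {
    const color = highlightColor(element.getAttribute("data-trust-sources"));
    if (color) {
      element.style.backgroundColor = color;
    }
  });
}
```

Values that have no color in the original script, such as "anonymous", are left unhighlighted here as well.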

Here is how a news article might appear without syntax highlighting enabled.

Here is how a news article might appear with syntax highlighting enabled.

Search engine and news aggregator filtering

Search engines, which scour the Web for content using spiders, and news aggregators, which typically collect content through RSS/Atom feeds, can both use Data-Trust attributes to help filter news reports.

For instance, a keyword search on Google News might turn up a number of copies of the same story that were published in various news outlets. And in many cases, if a larger news outlet reprints a story that was originally published by a smaller news outlet, the larger news outlet's version of the story will appear first in the search results.

If Google News' algorithm were instead to give preference to stories whose data-trust-original-source attribute is set to "original", smaller news outlets would get more exposure for the news reports that they produced.
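
The preference described above can be sketched as a simple re-ordering step. This is only an illustration and says nothing about how any real search engine ranks results. The outlet names, the result objects, and the function preferOriginal are all hypothetical, and each result is assumed to carry the value of its data-trust-original-source attribute in an originalSource field.

```javascript
// Hypothetical search results for one story, in the engine's initial order.
const results = [
  { outlet: "Big City Daily", originalSource: "reprint" },
  { outlet: "Village Green Gazette", originalSource: "original" },
  { outlet: "Regional Wire", originalSource: undefined }, // attribute absent
];

// Move results marked "original" ahead of all others. The input array is
// copied first, and because Array.prototype.sort is stable, results with
// the same rank keep their initial relative order.
function preferOriginal(items) {
  const rank = (item) => (item.originalSource === "original" ? 0 : 1);
  return [...items].sort((a, b) => rank(a) - rank(b));
}

console.log(preferOriginal(results).map((item) => item.outlet));
```

Reprints and untagged stories are treated alike in this sketch; a real ranking could weigh them differently.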

Being able to quickly determine, using the Data-Trust schema, which news outlet is the original source of a news report will allow readers to better determine the trustworthiness of the information.