Glossary

Last updated on 2023-04-28



including tab separated (tsv), comma separated (csv), Excel (xls, xlsx), JSON, XML, RDF as XML, Google Spreadsheets

csv
A file extension indicating a text file whose values are separated by commas (comma-separated values).
Clustering
A method for finding groups of different values that may actually represent the same thing.
Faceting
A method for exploring the values in a variable. In this episode it is used to explore the values in order to identify errors in data entry.
Filter
To select a subset of data from a dataframe.
JSON
A file extension indicating that the values in a text file are structured using JavaScript Object Notation (JSON).
RDF
A file extension indicating that the values in a file are structured using Resource Description Framework (RDF).
Regular expressions (regex)
A text string that describes a search pattern. Regular expressions usually incorporate wildcards to match letters, numbers, punctuation, spacing, or some combination.
tsv
A file extension indicating a text file whose values are separated by tabs (tab-separated values).
xls
A file extension indicating that a file is a spreadsheet created by Microsoft Excel.
xlsx
A file extension indicating that a file is a spreadsheet created by Microsoft Excel using XML.
XML
A file extension indicating that the values in a file are structured using Extensible Markup Language (XML).
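Several of the terms above (tsv, faceting, clustering, filter, regular expressions) can be illustrated together in a minimal Python sketch. Everything here is an assumption made for illustration: the sample data, the column names, and the helper functions `read_rows`, `facet`, `fingerprint`, `cluster`, and `filter_rows` are hypothetical, and the clustering shown is a simplified key-collision approach rather than the method of any particular tool.

```python
import csv
import io
import re
from collections import Counter, defaultdict

# Hypothetical tab-separated (tsv) sample with a header row.
# Note the inconsistent entry "paris " (lowercase, trailing space).
TSV_TEXT = "city\tyear\nParis\t2016\nparis \t2016\nLyon\t2017\nParis\t2017\n"


def read_rows(text, delimiter):
    """Parse delimited text (csv or tsv) into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def facet(rows, column):
    """Faceting: count how often each distinct value occurs in a column."""
    return Counter(row[column] for row in rows)


def fingerprint(value):
    """Build a normalised key: trim, lowercase, drop punctuation, sort unique tokens."""
    cleaned = re.sub(r"[^\w\s]", "", value.strip().lower())
    return " ".join(sorted(set(cleaned.split())))


def cluster(values):
    """Clustering: group distinct values that share the same fingerprint key."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    # Only groups with more than one spelling are candidates for merging.
    return [sorted(group) for group in groups.values() if len(group) > 1]


def filter_rows(rows, column, pattern):
    """Filter: keep the rows whose value in `column` matches a regular expression."""
    return [row for row in rows if re.search(pattern, row[column])]


rows = read_rows(TSV_TEXT, "\t")
city_counts = facet(rows, "city")
candidate_clusters = cluster(city_counts.keys())
rows_2017 = filter_rows(rows, "year", r"^2017$")
```

In this sketch the facet exposes "paris " as a separate value from "Paris", which is the kind of data-entry error faceting is meant to surface, and the cluster step proposes the two spellings as one group. Reading a csv file instead would only require passing `","` as the delimiter.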