Raw data, also known as source data or atomic data, is information that has not yet been processed into a presentable form. In its raw form it may be hard to recognize and nearly meaningless without processing, but it may also be in a form that some people can interpret, depending on the situation. This data can be processed manually or by a machine.
In some cases, raw data may be nothing more than a series of numbers. The way those numbers are sequenced, however, and sometimes even the way they are spaced, can be very important information. A computer may interpret this information and give a readout that then may make sense to the reader.
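As a minimal sketch of how sequence and spacing carry meaning, the Python snippet below slices one line of digits into named fields by position. The record layout, the field names, and the sample line are all hypothetical, made up purely for illustration.

```python
# Hypothetical fixed-width layout: these column positions are assumptions
# made for illustration, not a real file format.
FIELDS = [
    ("station_id", 0, 4),      # characters 0-3
    ("year", 4, 8),            # characters 4-7
    ("temp_tenths_c", 8, 12),  # characters 8-11, padded with spaces
]


def parse_record(line):
    """Slice one raw fixed-width line into named integer fields."""
    return {name: int(line[start:end]) for name, start, end in FIELDS}


raw = "00422023 215"  # made-up sample line
record = parse_record(raw)
print(record)  # {'station_id': 42, 'year': 2023, 'temp_tenths_c': 215}
```

The same twelve characters sliced at different positions would give entirely different values, which is why the layout matters as much as the digits themselves.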
Binary code is a good example of raw data. Taken by itself as a printout, binary code means very little to the vast majority of computer users. When it is processed through a computer, on the other hand, it provides more understandable information. In fact, everything a computer user sees is ultimately stored and processed as binary, even though the source code that programmers write is usually in a more readable language.
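To make this concrete, here is a small Python sketch that turns a string of bits into readable text. It assumes the bits are 8-bit ASCII character codes separated by spaces. That is only one of many possible interpretations, and the function name is hypothetical.

```python
raw_bits = "01001000 01101001"  # made-up sample: two 8-bit groups


def decode_bits(bits):
    """Interpret space-separated 8-bit groups as ASCII characters."""
    return "".join(chr(int(group, 2)) for group in bits.split())


print(decode_bits(raw_bits))  # Hi
```

Read as two unsigned integers instead, the same bits would be 72 and 105. The raw data does not change, only the rule used to interpret it.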
In some cases, this type of data may never be seen in its final form, especially by those working in data entry applications. In these situations, the user is responsible only for entering the information, and sometimes the person entering the data may not even know exactly what they are entering or why. This is especially helpful when security or privacy is important, because a worker who cannot tell what the data means has little opportunity to insert biased or intentionally false information in order to hurt or benefit someone.
For example, in some medical applications, there can be very strict regulations regarding patient privacy, yet the data still may need to be entered into a database. To prevent as many people as possible from identifying the patients, each may be assigned a number. Their conditions may also be assigned a number, as well as their treatment options. Without the knowledge of what those numbers mean, there is no way to identify the patient or condition. That identifying information may only be available to a handful of people.
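The numbering scheme described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a compliant de-identification method: the `CodeBook` class, the sequential codes, and the sample names and conditions are all hypothetical.

```python
import itertools


class CodeBook:
    """Maps sensitive values to opaque numeric codes.

    In practice the code book would be stored separately from the coded
    data and be readable by only a handful of people.
    """

    def __init__(self, start):
        self._counter = itertools.count(start)
        self._codes = {}

    def code_for(self, value):
        # Reuse the existing code so the same value always maps to one number.
        if value not in self._codes:
            self._codes[value] = next(self._counter)
        return self._codes[value]


patients = CodeBook(start=1000)
conditions = CodeBook(start=1)


def encode_row(name, condition):
    """Return the coded (patient, condition) pair a data entry clerk would see."""
    return (patients.code_for(name), conditions.code_for(condition))


# Made-up sample records.
rows = [
    encode_row("Patient A", "asthma"),
    encode_row("Patient B", "diabetes"),
    encode_row("Patient A", "diabetes"),
]
print(rows)  # [(1000, 1), (1001, 2), (1000, 2)]
```

Anyone holding only `rows` sees numbers. Turning them back into names and conditions requires the code books.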
This example is actually pretty unusual, since information is rarely converted into a form considered raw. Instead, raw data is usually processed to make it more refined. There are many different applications where unprocessed data appears, however, and the rules regarding what to do with it depend on the situation.
@allenJo - That’s a good point. Another example would be people working in the intelligence industry. Every day they get tons of tips, leads, chatter, emails, phone conversations, etc., that they have to filter through and convert into “actionable intelligence.” It’s not easy by any stretch of the imagination, but it can be done.
I think this excellent raw data definition serves as a good jumping-off point for a basic discussion of data analysis. The important point to make from the beginning is the distinction between data and information. Data is what the computer (or whatever machine) provides. Information is what we think is important.
A good example of this is found in economics. Every day we are subjected to raw statistical data: interest rates, stock prices, unemployment figures, payroll numbers, etc.
However, it’s the job of the economist, and other analysts, to answer the question, “What does it all mean?” Of course people will come to different conclusions based on what models they use. But the data itself isn’t useful without some context to make it meaningful.