Home / Statistical Tools / Data Types / Data Types Details
Data Types Details¶
Excel does not enforce data types in cells—users can enter text, numbers, dates, or any combination in any cell. For statistical analysis, however, data types matter. You cannot calculate a mean on text values, and date grouping only makes sense for date/time data.
Quantum XL solves this by letting you specify the data type for each column. When you import data, Quantum XL automatically detects the most likely type for each column, but you can change it if needed.
Column Data Types¶
Quantum XL supports five column data types:
| Data Type | Description | Example Values | Use For |
|---|---|---|---|
| Continuous | Any numerical value (integers or decimals, positive or negative) | 1.5, -3.14, 100, -5 | Measurements, calculated values, any numerical data |
| Count | Non-negative whole numbers | 0, 1, 42, 1000 | Defect counts, occurrences, quantities |
| Integer | Whole numbers (positive or negative) | -5, 0, 42 | Discrete values that can be negative |
| DateTime | Date and/or time values | 1/15/2025, 3:30 PM | Time-series data, dates, timestamps |
| Nominal | Typically text values | "Red", "Blue", "Pass", "Fail" | Categories, labels, names, groups |
Understanding Nominal Data
In statistics, nominal data represents categories or labels where the values have no inherent order or numerical meaning. The term comes from the Latin word "nomen" meaning "name"—nominal data simply names or labels things.
Key characteristics:
- Values represent distinct categories with no ranking or order
- Mathematical operations (mean, sum) are meaningless—you can only count occurrences
- Two values are either the same category or different categories
Nominal data doesn't have to be text. If you have machines labeled 1, 2, 3, 4, these numbers are nominal—Machine 2 is not "greater than" Machine 1, and averaging Machine 1 + Machine 3 = Machine 2 makes no sense. The numbers are just convenient labels.
Other examples of numeric nominal data:
- ZIP codes (90210, 10001)
- Product SKUs (1001, 1002, 1003)
- Employee IDs (E001, E002)
- Phone numbers
When Quantum XL detects a column as Nominal, it treats each unique value as a distinct category for grouping and counting, regardless of whether the values are text or numbers.
Automatic Type Detection¶
When you select data for analysis, Quantum XL examines each column and automatically suggests a data type based on the values it contains.
How Detection Works¶
For each column, Quantum XL counts how many values can be interpreted as each type. If more than 50% of the non-empty values in a column can be parsed as a particular type, that type is selected.
The detection follows this priority order:
- DateTime — If more than half the values are cells formatted as dates in Excel OR text that parses as a date
- Count — If more than half the values are non-negative whole numbers AND there are no negative integers
- Integer — If more than half the values are whole numbers (including negatives)
- Continuous — If more than half the values are numbers (including integers and decimals)
- Nominal — Default if none of the above conditions are met
DateTime Detection Details¶
Quantum XL recognizes dates in two forms:
- Cells formatted as dates in Excel — When Excel stores the value as a date (not just a number that looks like a date)
- Text that parses as a date — Strings like "January 15, 2025", "1/15/2025", or "2025-01-15"
Plain numbers are not treated as dates, even if they fall within Excel's valid date range. This prevents numeric data like "42000" from being incorrectly detected as a date. If you have a column of Excel serial dates that appears as numbers, you should either format them as dates in Excel before analysis, or manually set the column type to DateTime.
How Excel Stores Dates Internally
Excel stores dates as serial numbers counting days since January 1, 1900:
- Day 1 = January 1, 1900
- Day 2 = January 2, 1900
- Day 42000 = December 25, 2014
Times are stored as decimal fractions of a day:
- 0.5 = 12:00 PM (noon, halfway through the day)
- 42000.5 = December 25, 2014 at 12:00 PM
- 42000.75 = December 25, 2014 at 6:00 PM
Key limitations:
- Negative numbers cannot be dates — The serial date system starts at 1, so values like -5 or 0 are not valid dates
- No dates before January 1, 1900 — Excel cannot store dates earlier than this in numeric format (though you can type them as text strings)
The date format is just display formatting, not the underlying value. You can type the number 42000 into any Excel cell, then apply a date format (Format Cells → Date), and Excel will display "12/25/2014" or similar depending on your locale. The cell still contains the number 42000—the formatting just tells Excel how to display it.
The 1900 Leap Year Bug
Excel incorrectly treats 1900 as a leap year, even though it was not. This means Excel believes February 29, 1900 exists (serial date 60), when in reality that date never occurred. This bug was inherited from Lotus 1-2-3 for compatibility reasons, and Microsoft has documented that they will not fix it because doing so would break compatibility with existing spreadsheets. See Microsoft's documentation of this issue.
How Quantum XL corrects for this: When Quantum XL reads dates from Excel, it automatically detects dates in the affected range (January 1, 1900 through February 28, 1900) and shifts them forward by one day to correct for the bug. This ensures that dates are interpreted correctly for statistical analysis. You do not need to take any action—the correction happens automatically.
Example Detection Scenarios¶
| Column Contains | Detected Type | Reason |
|---|---|---|
| 1.5, 2.3, 4.7, 3.1 | Continuous | All values are decimals |
| 1, 2, 3, 4, 5 | Count | All values are non-negative whole numbers |
| -2, 0, 3, -1, 5 | Integer | Values include negative whole numbers |
| Red, Blue, Green | Nominal | All values are text |
| 1/1/2025, 1/2/2025, 1/3/2025 | DateTime | All values are dates |
| 8:30 AM, 2:15 PM, 11:00 AM | DateTime | All values are times |
| 1/15/2025 8:30 AM, 1/16/2025 2:15 PM | DateTime | All values are dates with times |
| 1, 2, "Cat", 3, 4 | Continuous | 80% are numbers (4 of 5 non-empty) |
| 1, "Dog", "Cat", "Bird" | Nominal | Only 25% are numbers |
Changing Column Types¶
You can override the automatically detected type by selecting a different type in the data source configuration. When you change a column's type, values that cannot be converted to the new type are filtered out (treated as empty).
Conversion Rules¶
When you change a column to a specific type, each value is validated against that type:
Continuous¶
- Valid: Any value that can be parsed as a number (integers or decimals)
- Invalid (filtered out): Text that is not a number, dates
Count¶
- Valid: Non-negative whole numbers, including decimals that equal whole numbers (e.g., 5.0 → 5)
- Invalid (filtered out): Negative numbers, text, decimals with fractional parts, dates
Integer¶
- Valid: Whole numbers (positive or negative), including decimals very close to whole numbers (e.g., 1.9999999 → 2 due to floating-point tolerance)
- Invalid (filtered out): Text, decimals with significant fractional parts, dates
DateTime¶
- Valid: Date/time values, text parseable as dates, Excel serial date numbers
- Invalid (filtered out): Text that cannot be parsed as a date, numbers outside the valid date range
Nominal¶
- Valid: Almost any value (converted to text)
- Invalid (filtered out): Null values, empty strings, whitespace-only strings
Data Loss When Changing Types
Changing a column's type may cause some values to be filtered out. For example, if a column contains "Dog", 1, 1.34, -1 and you change it to Count, only the value 1 will remain—"Dog", 1.34, and -1 will be treated as empty because they are not non-negative whole numbers.
Type Change Examples¶
Given a column with these values: "Dog", 1, 1.34, -1
| Change To | Values Kept | Values Filtered |
|---|---|---|
| Continuous | 1, 1.34, -1 | "Dog" |
| Count | 1 | "Dog", 1.34, -1 |
| Integer | 1, -1 | "Dog", 1.34 |
| DateTime | None | All values |
| Nominal | "Dog", "1", "1.34", "-1" (all as text) | None |
How Filtering Works with Multiple Columns¶
When a value is filtered out, it becomes empty in that cell only—rows do not shift up or get deleted. This is important to understand when your data has multiple columns, because other columns in the same row remain unchanged.
Example 1: Changing to Integer¶
Your original data:
| Row | Region | Value |
|---|---|---|
| 1 | North | 1.21 |
| 2 | South | 3 |
| 3 | East | 5 |
If you set the Value column to Integer, the decimal 1.21 cannot be converted and becomes empty:
| Row | Region | Value |
|---|---|---|
| 1 | North | (empty) |
| 2 | South | 3 |
| 3 | East | 5 |
Notice that "North" stays in row 1—the row does not shift up or disappear.
Example 2: Mixed Invalid Values¶
Your original data:
| Row | Category | Measurement |
|---|---|---|
| 1 | A | 10.5 |
| 2 | B | N/A |
| 3 | C | 20.3 |
| 4 | D | Error |
| 5 | E | 15.0 |
If you set the Measurement column to Continuous, the text values "N/A" and "Error" become empty:
| Row | Category | Measurement |
|---|---|---|
| 1 | A | 10.5 |
| 2 | B | (empty) |
| 3 | C | 20.3 |
| 4 | D | (empty) |
| 5 | E | 15.0 |
The Category column is completely unaffected—"B" and "D" remain in their original rows.
Example 3: Filtering in the First Column¶
Your original data:
| Row | Code | Description |
|---|---|---|
| 1 | 101 | Widget |
| 2 | ABC | Gadget |
| 3 | 102 | Tool |
If you set the Code column to Count, the text "ABC" becomes empty:
| Row | Code | Description |
|---|---|---|
| 1 | 101 | Widget |
| 2 | (empty) | Gadget |
| 3 | 102 | Tool |
"Gadget" remains in row 2, paired with an empty Code value.
Why This Matters
Understanding that filtering creates empty cells (not deleted rows) is important for interpreting your analysis results. Rows with empty values in a column are typically excluded from calculations involving that column, but may still be included in other parts of the analysis.
Data Cleaning Rules¶
Quantum XL automatically cleans data when it is imported:
Empty Values¶
The following are treated as empty (missing data):
- Null cells
- Cells containing only whitespace (spaces, tabs)
- Excel error codes:
#DIV/0!,#N/A,#NAME?,#NULL!,#NUM!,#REF!,#VALUE!
Whitespace Handling¶
- Leading and trailing whitespace is trimmed from column names
- Whitespace-only cell values are replaced with empty (treated as missing data)
- For Nominal columns, whitespace-only values are filtered out during analysis
Excel Date Handling¶
Excel has a known issue where it incorrectly treats 1900 as a leap year (February 29, 1900 did not exist). Quantum XL automatically corrects for this when converting dates from Excel, ensuring dates from January 1, 1900 through February 28, 1900 are handled correctly.
Choosing the Right Type¶
| If Your Data Represents... | Choose |
|---|---|
| Measurements (length, weight, temperature) | Continuous |
| Counts of items or events (defects, occurrences) | Count |
| Discrete values that can be negative (scores, differences) | Integer |
| Dates, times, or timestamps | DateTime |
| Categories or groups (colors, product names, pass/fail) | Nominal |
When in Doubt
If you're unsure which type to use, consider how the data will be analyzed:
- Continuous data can be used for calculations like mean, standard deviation, and regression
- Count data is appropriate when values represent discrete occurrences and cannot be negative
- Integer data is similar to Count but allows negative values
- DateTime data enables time-based grouping—see DateTime Grouping for details on grouping by year, month, week, day, hour, minute, or second
- Nominal data is used for grouping and categorical analysis