😷 Base Rate Fallacy

Why do we rely on specific information over statistics?

OWL file of the ontology

Give me a definition of "Base Rate Fallacy"

The base rate fallacy is the tendency to neglect the overall context, or base rate information, in favor of case-specific information, which can lead to inaccurate perceptions and decisions.

For example, if a medical test for a disease is 95% accurate and someone tests positive, one might assume there's a 95% chance they have the disease. However, if the disease only affects 1% of the population (the base rate), the actual probability of having the disease, despite a positive test, is much lower: about 16%, if "95% accurate" means the test is right 95% of the time for both sick and healthy people. This figure comes from Bayes' theorem, which combines the base rate with the test's accuracy to provide a more accurate assessment.
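The arithmetic behind this example can be sketched in a few lines. This is a minimal illustration, not part of the ontology: the function name `posterior_given_positive` is hypothetical, and it assumes that "95% accurate" means both 95% sensitivity and 95% specificity, which the text above does not state explicitly.

```python
def posterior_given_positive(base_rate: float, sensitivity: float, specificity: float) -> float:
    """Return P(disease | positive test) using Bayes' theorem."""
    # Share of the population that is sick AND tests positive.
    true_positives = sensitivity * base_rate
    # Share of the population that is healthy AND still tests positive.
    false_positives = (1.0 - specificity) * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)


# Assumption: "95% accurate" = 95% sensitivity and 95% specificity.
p = posterior_given_positive(base_rate=0.01, sensitivity=0.95, specificity=0.95)
print(f"P(disease | positive) = {p:.1%}")
```

Under these assumptions the healthy 99% of the population produce far more positives than the sick 1%, which is why the result lands near 16% rather than 95%.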

Now give me 10 example scenarios of “Base Rate Fallacy”

  1. Medical Testing: Imagine a rare disease that affects only 2 out of every 10,000 people. A test for this disease is highly accurate, with a 99% chance of a positive result if you have the disease and a 99% chance of a negative result if you don't. If you receive a positive result, the intuitive reaction might be to assume you almost certainly have the disease, neglecting the very low base rate of the disease in the population.

  2. Email Spam Filter: An email spam filter catches 99% of spam emails but also incorrectly classifies 1% of legitimate emails as spam. Given that only 10% of the emails received are spam, if an email is marked as spam, many might overestimate the probability it's actually spam, ignoring the base rate of spam versus legitimate emails.

  3. Airport Security: Suppose an airport security system has a 95% accuracy rate at identifying a security threat but also has a 5% false positive rate in a context where actual threats are extremely rare (say 1 in a million). If a piece of luggage is flagged as a threat, the base rate fallacy would have people believe it's very likely to be a threat, disregarding the extremely low base rate of actual threats.

  4. Job Applicant Screening: An AI tool screens job applicants and is correct 90% of the time. When an applicant is flagged as a poor fit, a recruiter might assume the flag is almost certainly right, ignoring the fact that only a small percentage of applicants are actually unsuitable for the role. Neglecting this base rate can lead to good candidates being overlooked.

  5. Terrorist Identification: A surveillance system can identify terrorists with 99% accuracy and has a 1% false positive rate. Given that terrorists are extremely rare in the general population, a positive identification might lead people to overestimate the likelihood that the person identified is actually a terrorist, neglecting the base rate of terrorism.

  6. Sports Talent Scouting: A talent scout believes that because a small town has produced several major league players, a youngster from the same town has a high probability of becoming a major league player, too, ignoring the base rate that only a tiny fraction of players make it to the majors.

  7. Lottery Odds: People might overestimate their chances of winning the lottery based on stories of lottery winners, neglecting the base rate that the odds of winning are often millions to one against.

  8. Academic Achievement: Someone assumes a student must be a genius because they got a perfect score on a particular standardized test, without considering the base rate of how many students achieve such scores and the possible influence of other factors like test prep or guessing.

  9. Insurance Risk: A person buys extensive earthquake insurance because there was a recent earthquake, overestimating the likelihood of another earthquake happening soon, without considering the base rate of earthquake occurrences in their area.

  10. Investment Success: An investor hears about a few people making significant returns from investing in a certain type of startup. They then overestimate the success rate of such investments, ignoring the base rate that most startups fail.
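Scenarios 1 to 3 give enough numbers to check how often a flag is actually right. The sketch below restates them as expected counts in a population of one million cases. The helper `expected_counts` and the dictionary labels are hypothetical names used only for illustration, and scenario 3's "95% accuracy rate" is read as a 95% hit rate.

```python
def expected_counts(population: int, base_rate: float, hit_rate: float, false_alarm_rate: float):
    """Return (true positives, false positives) expected in a population."""
    actual = population * base_rate                      # cases that really are positive
    true_pos = actual * hit_rate                         # correctly flagged
    false_pos = (population - actual) * false_alarm_rate  # wrongly flagged
    return true_pos, false_pos


# (base rate, hit rate, false alarm rate) taken from scenarios 1-3 above.
scenarios = {
    "Medical testing (1)": (2 / 10_000, 0.99, 0.01),
    "Spam filter (2)": (0.10, 0.99, 0.01),
    "Airport security (3)": (1 / 1_000_000, 0.95, 0.05),
}

results = {}
for name, (base_rate, hit_rate, false_alarm_rate) in scenarios.items():
    tp, fp = expected_counts(1_000_000, base_rate, hit_rate, false_alarm_rate)
    results[name] = tp / (tp + fp)  # share of flags that are real
    print(f"{name}: {tp:,.0f} true vs {fp:,.0f} false flags -> {results[name]:.3%} real")
```

With these inputs, roughly 1.9% of positive medical tests and about 0.002% of flagged bags are real, while about 91.7% of flagged emails are spam. The spam case suggests that how far intuition goes wrong depends on how low the base rate is: at a 10% base rate the gap between the intuitive 99% and the actual figure is modest.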

Now create a user story for the “Medical Testing” scenario

User Story: Medical Testing Scenario

Title: Improving Patient Understanding of Test Results in the Context of Rare Diseases

As a primary care physician,

I want to ensure that my patients correctly understand the implications of their medical test results, especially for rare diseases,

So that they can have an accurate perception of their health status and make informed decisions about their healthcare without unnecessary anxiety.

Scenario:

Dr. Lee is a physician who orders a test for a rare genetic condition for a patient presenting with some symptoms. She knows how rare the condition is and how accurate the test is, but despite her extensive medical knowledge she often falls prey to the base rate fallacy, which affects her judgment. Upon receiving a positive test result, Dr. Lee immediately informs the patient that they likely have the condition, without fully taking into account the base rate of the disease in the population.

Now, if you should make an ontology of the "Base Rate Fallacy", what classes and properties would you create?

Classes

  1. Fallacy

    • Description: Represents a logical or reasoning error.

    • Properties: Name, Description, Examples, RelatedFallacies

  2. BaseRateFallacy

    • Description: A subclass of Fallacy focusing on errors made by ignoring statistical base rates.

    • Properties: Name, Description, CommonContexts, PreventionStrategies

  3. Context

    • Description: Situations or domains where the fallacy can occur.

    • Properties: Name, Description, Domain

  4. StatisticalConcept

    • Description: Key statistical ideas relevant to understanding and avoiding the fallacy.

    • Properties: Name, Definition, Relevance

  5. DiagnosticTest

    • Description: A test or procedure to determine the presence of a condition.

    • Properties: Name, Sensitivity, Specificity, PositivePredictiveValue, NegativePredictiveValue

  6. Condition

    • Description: A disease or situation being tested for or considered in a decision-making process.

    • Properties: Name, BaseRate, Symptoms

  7. PreventionStrategy

    • Description: Strategies to avoid falling into the base rate fallacy.

    • Properties: Name, Description, ApplicationContext

Properties

  1. hasContext

    • Domain: BaseRateFallacy

    • Range: Context

  2. involvesStatisticalConcept

    • Domain: BaseRateFallacy

    • Range: StatisticalConcept

  3. utilizesDiagnosticTest

    • Domain: Context

    • Range: DiagnosticTest

  4. testsForCondition

    • Domain: DiagnosticTest

    • Range: Condition

  5. hasBaseRate

    • Domain: Condition

    • Range: float (representing a percentage or probability)

  6. hasPreventionStrategy

    • Domain: BaseRateFallacy

    • Range: PreventionStrategy

  7. appliesToContext

    • Domain: PreventionStrategy

    • Range: Context
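One consequence of this class design is that PositivePredictiveValue and NegativePredictiveValue are not independent of the other properties: they follow from a test's Sensitivity and Specificity together with the BaseRate of the Condition it tests for. The sketch below illustrates that dependency with plain Python dataclasses. It is an assumption-laden illustration rather than the ontology's implementation, and the instance names `RareGeneticCondition` and `GeneticScreen` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    name: str
    base_rate: float  # hasBaseRate, expressed as a probability in [0, 1]


@dataclass(frozen=True)
class DiagnosticTest:
    name: str
    sensitivity: float
    specificity: float
    tests_for: Condition  # testsForCondition

    @property
    def positive_predictive_value(self) -> float:
        """P(condition | positive result), derived from the base rate."""
        p = self.tests_for.base_rate
        true_pos = self.sensitivity * p
        false_pos = (1 - self.specificity) * (1 - p)
        return true_pos / (true_pos + false_pos)

    @property
    def negative_predictive_value(self) -> float:
        """P(no condition | negative result), derived from the base rate."""
        p = self.tests_for.base_rate
        true_neg = self.specificity * (1 - p)
        false_neg = (1 - self.sensitivity) * p
        return true_neg / (true_neg + false_neg)


# Hypothetical instances using the numbers from the Medical Testing scenario.
rare = Condition(name="RareGeneticCondition", base_rate=0.0002)
test = DiagnosticTest(name="GeneticScreen", sensitivity=0.99, specificity=0.99, tests_for=rare)
print(f"PPV = {test.positive_predictive_value:.4f}, NPV = {test.negative_predictive_value:.6f}")
```

Modelling the predictive values as derived quantities means they cannot drift out of sync with the base rate. In an OWL ontology they would more likely be stored as data properties and checked or computed outside the reasoner.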

Competency Questions and SPARQL Queries

  • What personal observations were produced by reflecting upon an activity being carried out?

SPARQL Query:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX exob: <https://modellingdh.github.io/ont/odp/term/>
PREFIX fs: <https://w3id.org/framester/data/framestercore/>
PREFIX brf: <https://raw.githubusercontent.com/ranacoskun/cognitive_bias_ontology/main/BaseRateFallacy/baseratefallacy.owl#>

SELECT DISTINCT ?observationLabel WHERE {
	?agent rdfs:label ?name .
	FILTER regex(str(?name), "Dr Lee") .
	?agent exob:isEngagedIn ?assessing .
	?assessing exob:isReflectedUponIn ?observation .
	?observation rdfs:label ?observationLabel .
}
  • Which activities is Dr. Lee influenced by?

SPARQL Query:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX exob: <https://modellingdh.github.io/ont/odp/term/>
PREFIX fs: <https://w3id.org/framester/data/framestercore/>
PREFIX brf: <https://raw.githubusercontent.com/ranacoskun/cognitive_bias_ontology/main/BaseRateFallacy/baseratefallacy.owl#>

SELECT DISTINCT ?activityLabel ?engagementLabel WHERE {
	?agent rdfs:label ?name .
	FILTER regex(str(?name), "Dr Lee") .
	?agent exob:isEngagedIn ?engagement .
	?engagement rdfs:label ?engagementLabel ; exob:inActivity ?activity .
	?activity rdfs:label ?activityLabel .
}
  • What type of information does the individual "Dr. Lee" engage in assessing?

SPARQL Query:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX exob: <https://modellingdh.github.io/ont/odp/term/>

SELECT ?activityLabel
WHERE {
    ?agent exob:isEngagedIn ?activity .
    ?agent rdfs:label ?label .
    FILTER regex(str(?label), "Dr Lee") .
    ?activity rdfs:label ?activityLabel .
}
  • What is used to make decisions?

SPARQL Query:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX brf: <https://raw.githubusercontent.com/ranacoskun/cognitive_bias_ontology/main/BaseRateFallacy/baseratefallacy.owl#>
PREFIX fs: <https://w3id.org/framester/data/framestercore/>

SELECT ?info
WHERE {
    ?decision rdf:type fs:Deciding .
    ?decision brf:isBasedOn ?info .
}

Classes and Properties Created

Our classes:

CognitiveBias: Represents the overarching concept of cognitive biases.

BiasedAgent: Represents an entity that experiences the bias.

BaseRateFallacy: The base rate fallacy illustrates how neglecting the overall context or base rate information can lead to inaccurate perceptions and decisions.

BaseRateInformation: General statistical information, i.e. the percentage of a population that has a specific characteristic.

SpecificInformation: Information that is specific to the individual case.

BaseRate: The parameter that Base Rate Information has.

Our properties:

isAffectedBy: Relates an individual to the cognitive bias that influences them.

affects: Inverse of isAffectedBy.

isBasedOn: Links a decision-making activity to the information from which it is derived or based.

Chosen Framester Frames

o People

This frame contains general words for Individuals, i.e. humans. The Person is conceived of as independent of other specific individuals with whom they have relationships and independent of their participation in any particular activity. They may have an Age, Descriptor, Origin, Persistent_characteristic, or Ethnicity. Examples: "A man from Phoenix was shot yesterday." "She gave birth to a screaming baby yesterday." "I study 16-year-old female adolescents." "I am dating an African-American man." "She comforted the terrified child." "I always thought of him as a stupid man."

o Assessing

An Assessor examines a Phenomenon to figure out its Value according to some Feature of the Phenomenon. This Value is a factor in determining the acceptability of the Phenomenon. In some cases, a Method (implicitly involving an Assessor) is used to determine the Phenomenon's Value. Examples: "Each company is then evaluated for their earning potential." "From the evidence of the pilot studies the risk of damage to the test subjects was rated too high to continue." "He weighed his options carefully."

o Information

A Cognizer knows or comes to know some piece of Information about a Topic. In this frame, many LUs encode a specific Means_of_Gathering and/or Source, but these may also be expressed separately.

o Deciding

This frame was used to model the activity of making decisions.

Chosen Content ODPs

o Experience and Observation

To represent the epistemological "missing link" between a cognitive activity, e.g. the interaction with a cultural object, and any evidence of the effects this activity has on the individuals that are engaged with it; what can collectively be considered as an experience.

o Parameter

To represent parameters to be used for a certain concept.

Other ontologies used

o rdfs:subClassOf

The subject is a subclass of a class.
