Posts + Press

Beyond Silos: Integrating Biostatistics, Data Science, and Engineering to generate rigorous evidence

Mar 2024

Introduction

The promise of personalized medicine lies in the ability to generate robust evidence. How is evidence generated and what are the best practices of collaboration between biostatisticians, data scientists, and the engineering team?

Today, I had the opportunity to interview Eric J. Daza, a renowned biostatistician and an advocate of personalized medicine through n-of-1 trials and single-case designs.

Eric Jay's diverse experiences in academia, industry, and digital health data science provide a unique perspective on the evolving landscape of data science and evidence-based medicine.

The Power of Biostatistics: Eric J. Daza Holds the Key to Evolving the Healthcare Industry

Mar 2023

This is Statistics | American Statistical Association

Many students don’t realize how much they love statistics until they take their first class. This was the case for Dr. Eric J. Daza, a health data scientist with over two decades of experience. Throughout his educational and professional experiences, Eric has learned that statistics can be a framework for life.

Among his many accomplishments, Eric has contributed to the evolution of personalized health data analysis. His innovative work at Evidation Health demonstrates how data scientists and statisticians can actually change the world for the better.

Eric is also at the forefront of the American Statistical Association (ASA) efforts to improve diversity and inclusion practices across the profession as the Professional Development Chair of ASA’s Justice, Equity, Diversity, and Inclusion (JEDI) Outreach Group.

FiT Feature: Eric Daza

Feb 2023

Filipinx in Tech.

How long have you been in tech?

I’ve been in health tech for more than four years. This is my second career, having previously spent seven years in clinical trials biostatistics.

How did you know you wanted to get into tech?

I got into health tech through a few statistically significant events. ...

Statistics Is My Dance

Nov 2022

I failed or almost failed out of a few core STEM courses in my biology undergrad — including calculus. I barely passed my core major GPA requirement to graduate, and scored a middling overall GPA.

Today, I hold a doctorate in biostatistics from a top public health school, and completed a postdoc in health behavior at a top medical school. And in 2022, I was recognized by both Forbes and Fortune for my innovative work in health data science.

How did I get here?

10 innovators shaping the future of health

Nov 2022

K Brabaw. Fortune.

Fortune has a long history of highlighting innovative leaders. That tradition continues with our inaugural list spotlighting 10 people and teams creating the future of healthcare. Many of this year’s winners are finding creative solutions to systemic healthcare problems, from dramatically lowering the costs of prescription medication, and creating greater access to mental health services for communities of color, to building an easy-to-access opioid addiction recovery program with incredible retention rates. They’re business leaders, entrepreneurs, inventors, influencers, educators, and problem solvers. Each finalist has had a major accomplishment over the last year and is using their influence to increase health and wellness access and equity. Learn more about them below.

16 Healthcare Innovators That You Should Know

Jun 2022

A Holzwarth. Forbes.

These 16 innovators are changing the face of healthcare. They are behavioral scientists, product leaders, academic researchers, statisticians, physicians, clinicians, strategists, experts in diversity equity and inclusion, implementation scientists, and tech leaders. They cross boundaries and repel traditional healthcare models. They are on the front lines and behind the scenes. And what they all have in common is their passion and commitment to innovation in health.

A Moment with Eric Daza: On N-of-1 Trials and Precision Medicine

Dec 2021

EJ Daza, A Holzwarth. Pattern Health.

A Moment with Eric Daza is part of our interview series featuring thought leaders in research and healthcare. Each interview includes 7 short and stimulating questions.

Sneak peek at the full (free-to-read) interview:

1. Tell us something we don’t know. (Anything!)

The American English word “boondocks” (as in “the boonies”) comes from the Filipino Tagalog word “bundok” (boon-DOCK). ...

2. Which fiction book would you recommend to researchers and innovators in healthcare, and why?

I recommend reading the Foundation Trilogy (if not Series) by Isaac Asimov. I had no idea I’d go into statistics as a career! ...

Consistency, Causally Speaking

Oct 2021

"Why does 'consistency' matter in causal inference?" EJ Daza. Towards Data Science.

Consistency just says that the outcome you observe is exactly the outcome you thought you would observe. You want to be sure you’re measuring what you think you’re measuring.

There’s [a] mundane violation of consistency. ... You may have been prescribed the wrong dose of medication… [or] you accidentally took two pills a day instead of one.

This happens in an RCT, too. ... Dr. A and the other physicians intentionally committed a medication error against the study protocol. And patient B and the other participants were nonadherent to the assigned treatment.

Why You Should Think of the Enterprise of Data Science More Like a Business, Less Like Science

Sep 2021

'For Eric J. Daza, “how you sell your work matters in setting you up for success.”' EJ Daza, B Huberman. Towards Data Science.

In the Author Spotlight series, TDS Editors chat with members of our community about their career path in data science, their writing, and their sources of inspiration. Today, we’re thrilled to share Eric J. Daza, DrPH, MPS’s conversation with Ben Huberman.

Artifice or Intelligence?

Jul 2021

"Report your modeling strategy or statistical analysis plan before seeing any data". EJ Daza. Towards Data Science.

If your model isn’t performing well in prod on new data, untracked HARKing might be why.

Imagine calling your shot in pool after you made it! That’s HARKing — a bad research habit. Preregistration is when you call each shot even before stepping up to the table.

Ditch “Statistical Significance”

Jun 2021

"But keep statistical evidence. How? A statistician shares a writing sample". EJ Daza. Towards Data Science.

Key Concepts

“significant” p-value ≠ “significant” finding: Significance of evidence is not evidence of significance.
“significant” p-value = “discernible” finding: Significance of evidence is evidence of discernibility.

How A 'Secret Asian Man' Embraced Anti-Racism

Sep 2020

EJ Daza. LAist.

Guest contributor Eric Daza writes about his journey from the Filipino friend who blended in and bit his tongue when encountering casual racism to embracing his own Brown-ness — and with that, calling out racism.

Significant? You Really Mean Discernible

Aug 2020

"Two common wrong phrases about statistical significance". EJ Daza. Towards Data Science.

Wrong Phrases

There was a significant decrease of D in the outcome.
There was no significant association between variables X and Y.

Confusing P-values with Clinical Impact: The Significance Fallacy

May 2020

"Significance does not imply importance — but you need it to judge quality". EJ Daza. Towards Data Science.

Main Lessons

Ask yourself if a randomized controlled trial’s reported effect size estimate is meaningful, regardless of sample size.
Train yourself to internalize that significance does not imply importance.
Remember that sample size does not correlate with effect size.
Never just say “significant” when you really mean “statistically significant”. You will be misunderstood as saying “important”. Instead, always say or write out the whole phrase “statistically significant”.
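As a rough illustration of the lessons above (not taken from the article), the sketch below holds a small effect fixed and varies only the sample size. The helper `two_sided_p_value`, the known-SD z-test, and all numbers are assumptions chosen for illustration.

```python
import math

def two_sided_p_value(effect, sd, n_per_arm):
    """Two-sided z-test p-value for a difference in means between two equal-sized arms."""
    standard_error = sd * math.sqrt(2.0 / n_per_arm)
    z = effect / standard_error
    # P(|Z| >= |z|) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical numbers: a 0.5-unit difference on an outcome with SD 10.
effect, sd = 0.5, 10.0
for n in (100, 1_000, 10_000, 100_000):
    print(n, two_sided_p_value(effect, sd, n))
```

The effect size never changes, yet the p-value shrinks as the arms grow. Whether a 0.5-unit difference matters is a separate, clinical question.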

Your Coronavirus Telemedicine Health App Might Be Overrated: How to Tell

Apr 2020

"Causal inference tutorial in R using synthetic data (Part 2)". EJ Daza. Towards Data Science.

We would overstate our health app’s effectiveness by claiming it reduces the risk of new coronavirus infections by 16.9% — when in fact it will only reduce this risk by 3.1%.

But we can re-weight our real-world evidence results to provide more accurate risk-reduction estimates of either 2.3% or 2.2%.
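One way to picture this kind of re-weighting is standardization over a single binary confounder. The sketch below is in Python rather than the tutorial's R, and it does not reproduce the tutorial's data or code: the counts, the confounder, and the helper names `naive_risk` and `standardized_risk` are all hypothetical.

```python
# Hypothetical counts: (confounder level, used app) -> (people, new infections)
counts = {
    (0, 1): (800, 40),
    (0, 0): (200, 12),
    (1, 1): (200, 40),
    (1, 0): (800, 176),
}

def naive_risk(counts, treated):
    """Crude infection risk among app users (treated=1) or non-users (treated=0)."""
    people = sum(n for (_z, a), (n, _y) in counts.items() if a == treated)
    events = sum(y for (_z, a), (_n, y) in counts.items() if a == treated)
    return events / people

def standardized_risk(counts, treated):
    """Stratum-specific risks averaged over the whole sample's confounder distribution."""
    total = sum(n for n, _y in counts.values())
    risk = 0.0
    for z in {z for z, _a in counts}:
        stratum_size = sum(n for (zz, _a), (n, _y) in counts.items() if zz == z)
        n, y = counts[(z, treated)]
        risk += (stratum_size / total) * (y / n)
    return risk

naive_difference = naive_risk(counts, 1) - naive_risk(counts, 0)
adjusted_difference = standardized_risk(counts, 1) - standardized_risk(counts, 0)
print("naive risk difference:", naive_difference)
print("standardized risk difference:", adjusted_difference)
```

In these made-up counts, app users are concentrated in the low-risk stratum, so the crude comparison credits the app with a much larger risk reduction than the standardized one does.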

Coronavirus, Telemedicine, and Race: Simulated Real-World Evidence

Apr 2020

"Causal inference tutorial in R using synthetic data (Part 1)". EJ Daza. Towards Data Science.

We would overstate our telemedicine app’s effectiveness by claiming it reduces the risk of new coronavirus infections by 16.9% — when in fact it will only reduce this risk by 3.1%.

The Overlooked Data Scientists in the Fight against Coronavirus: Biostatisticians

Apr 2020

EJ Daza. Towards Data Science.

This is exactly the time to temper the sprinting agility of data science with the scientifically rigorous methodology of biostatistics.

Posts | Press | Papers | Patents: Publications

Featured Papers + Patents

(NCBI Publications: click here)

Feb 2024

Methods and Systems for Generating Personalized Treatments via Causal Inference

United States Patent Application 20240047042

Provided herein are systems, methods, computer-readable media, and techniques for generating a personalized recommended intervention for a subject based on causal inference, including: obtaining a first set of time series data and a second set of time series data, the first set of time series data relating to a first variable indicative of a health behavior of the subject, and the second set of time series data relating to a second variable indicative of a health condition of the subject; determining a causal effect of the first variable on the second variable by estimating an average treatment effect, wherein the average treatment effect is estimated by processing the first set of time series data and the second set of time series data using a model-twin randomization method; and generating a personalized treatment or intervention recommendation for the subject to change the health condition based on the causal effect.

Full Text Link

Nov 2023

Patient and Physician Perspectives on the Use of a Connected Ecosystem for Diabetes Management: International Cross-Sectional Observational Study

E Benito-Garcia, J Vega, EJ Daza, W-N Lee, A Kennedy, J-M Chantelot. JMIR Formative Research.

Background

Collaboration between people with type 2 diabetes (T2DM) and their health care teams is important for optimal control of the disease and outcomes. Digital technologies could potentially tie together several health care-related devices and platforms into connected ecosystems (CES), but attitudes about CES are unknown.

Objective

We surveyed convenience samples of patients and physicians to better understand which patient characteristics are associated with higher likelihoods of (1) participating in a potential CES program, as self-reported by patients with T2DM and (2) clinical benefit from participation in a potential CES program, as reported by physicians.

Methods

Adults self-reporting a diagnosis of T2DM and current insulin use (n=197), and 33 physicians whose practices included ≥20% of such patients, were enrolled in the United States, France, and Germany. We surveyed both groups about the likelihood of patient participation in a CES. We then examined the associations between patients’ clinical and sociodemographic characteristics and this likelihood. We also described characteristics of patients likely to clinically benefit from CES use, according to physicians.

Results

Compared with patients in Germany and France, US patients were younger (mean age 45.3 [SD 11.9] years vs 61.9 [SD 9.2] and 65.8 [SD 9.4] years, respectively), more often female, more highly educated, and more often working full-time. In all, 51 (44.7%) US patients, 16 (36.4%) German patients, and 18 (46.3%) French patients indicated strong interest in a CES program, and 115 (78.7%) reported currently using ≥1 connected device or app. However, physicians believed that only 11.3%-19.2% of their patients were using connected devices or apps to manage their disease. Physicians also reported infrequently recommending or prescribing connected devices to their patients, although ≥80% (n=28) of them thought that a CES could help support their patients in managing their disease. The factors most predictive of patient likelihood of participating in a CES program were cost, inclusion of medication reminders, and linking blood glucose levels to behaviors such as eating and exercise. In all countries, the most common patient expectations for a CES program were that it could help them eat more healthfully, increase their physical activity, increase their understanding of how blood glucose relates to behavior such as exercise and eating, and reduce stress. Physicians thought that newly diagnosed patients, sicker patients—those who had been hospitalized for diabetes, were currently using insulin, or who had any comorbid condition—and patients who were nonadherent to treatment were most likely to benefit from CES use.

Conclusions

In this study, there was a high degree of interest in the future use of CES, although additional education is needed among both patients with T2DM and their physicians to achieve the full potential of such systems to improve self-management and clinical care for the disease.

Full Text Link

Aug 2022

What possibly affects nighttime heart rate? Conclusions from N-of-1 observational data

I Matias, EJ Daza, K Wac. Digital Health.

Background

Heart rate (HR), especially at nighttime, is an important biomarker for cardiovascular health. It is known to be influenced by overall physical fitness, as well as daily life physical or psychological stressors like exercise, insufficient sleep, excess alcohol, certain foods, socialization, or air travel causing physiological arousal of the body. However, the exact mechanisms by which these stressors affect nighttime HR are unclear and may be highly idiographic (i.e. individual-specific). A single-case or “n-of-1” observational study (N1OS) is useful in exploring such suggested effects by examining each subject's exposure to both stressors and baseline conditions, thereby characterizing suggested effects specific to that individual.

Objective

Our objective was to test and generate individual-specific N1OS hypotheses of the suggested effects of daily life stressors on nighttime HR. As an N1OS, this study provides conclusions for each participant, thus not requiring a representative population.

Methods

We studied three healthy, nonathlete individuals, collecting the data for up to four years. Additionally, we evaluated model-twin randomization (MoTR), a novel Monte Carlo method facilitating the discovery of personalized interventions on stressors in daily life.

Results

We found that physical activity can increase the nighttime heart rate amplitude, whereas there were no strong conclusions about its suggested effect on total sleep time. Self-reported states such as exercise, yoga, and stress were associated with increased (for the first two) and decreased (last one) average nighttime heart rate.

Conclusions

This study implemented the MoTR method evaluating the suggested effects of daily stressors on nighttime heart rate, sleep time, and physical activity in an individualized way: via the N-of-1 approach. A Python implementation of MoTR is freely available.

Full Text Link

Aug 2022 (In Preparation)

Model-Twin Randomization (MoTR): A Monte Carlo Method for Estimating the Within-Individual Average Treatment Effect Using Wearable Sensors (Pre-Print)

EJ Daza, L Schneider. arXiv.

Temporally dense single-person "small data" have become widely available thanks to mobile apps and wearable sensors. Many caregivers and self-trackers want to use these data to help a specific person change their behavior to achieve desired health outcomes. Ideally, this involves discerning possible causes from correlations using that person's own observational time series data. In this paper, we estimate within-individual average treatment effects of physical activity on sleep duration, and vice-versa. We introduce the model-twin randomization (MoTR; "motor") method for analyzing an individual's intensive longitudinal data. Formally, MoTR is an application of the g-formula (i.e., standardization, back-door adjustment) under serial interference. It estimates stable recurring effects, as is done in n-of-1 trials and single case experimental designs. We compare our approach to standard methods (with possible confounding) to show how to use causal inference to make better personalized recommendations for health behavior change, and analyze 222 days of Fitbit sleep and steps data for one of the authors.

Full Text Link

May 2022

Estimating the Burden of Influenza-like Illness on Daily Activity at the Population Scale Using Commercial Wearable Sensors

A Mezlini, A Shapiro, EJ Daza, E Caddigan, E Ramirez, T Althoff, L Foschini. JAMA Network Open.

Question: How can the true burden of influenza-like illnesses (ILIs) be estimated given that most cases of ILIs are mild and go undocumented?

Findings: This cohort study of 15 122 adults who reported ILI symptoms and had data from wearable sensors at symptom onset found an overall reduction in mobility equivalent to 15% of the active US population becoming completely immobilized for 1 day. More than 60% of this reduction occurred among persons who had sought no medical care.

Meaning: This study suggests that the burden of ILIs is much greater than had previously been understood.

Full Text Link

Feb 2021

Creating Evidence from Real World Patient Digital Data

Editors: J Nikles, EJ Daza, S McDonald, E Hekler, NJ Schork. Frontiers in Psychiatry, Psychology, Digital Health, Neurology, Public Health, and Sociology.

N-of-1 randomized controlled trials (RCTs) provide an opportunity to evaluate individual patient response to interventions, by randomly allocating different time periods within an individual to repeated intervention and control conditions and comparing responses. N-of-1 observational studies involve the repeated measurement of an outcome (e.g. pain) in a patient over time, but with no intervention implemented, in order to draw conclusions about naturally-occurring patterns and predictors of outcomes over time.
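The allocation idea in the paragraph above can be sketched in a few lines. This is a simplified illustration, not a design taken from the Research Topic: the paired-block scheme, the helper names `randomize_periods` and `within_person_effect`, and the simulated pain scores are all assumptions.

```python
import random
from statistics import mean

def randomize_periods(n_blocks, rng):
    """Block-randomize one person's periods: each block holds one control (0) and one intervention (1) period in random order."""
    schedule = []
    for _ in range(n_blocks):
        block = [0, 1]
        rng.shuffle(block)
        schedule.extend(block)
    return schedule

def within_person_effect(schedule, outcomes):
    """Difference in mean outcome between intervention and control periods."""
    treated = [y for a, y in zip(schedule, outcomes) if a == 1]
    control = [y for a, y in zip(schedule, outcomes) if a == 0]
    return mean(treated) - mean(control)

rng = random.Random(2021)
schedule = randomize_periods(n_blocks=6, rng=rng)
# Hypothetical pain scores: about one point lower during intervention periods.
outcomes = [rng.gauss(5.0 - 1.0 * a, 0.5) for a in schedule]
print(schedule)
print(within_person_effect(schedule, outcomes))
```

Blocking keeps intervention and control periods balanced over time. A real analysis would also need to consider carryover between periods and autocorrelation, which this sketch ignores.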

Both N-of-1 RCTs and observational studies can have a ‘self-study’ design, where an individual conducts the study on themselves, to answer research questions they have generated themselves. N-of-1 RCTs and observational studies provide individualized findings that can be aggregated to produce results equivalent to those found in traditional group-based RCTs and population-level epidemiological studies, respectively, but requiring fewer patients for the same power. N-of-1 RCTs and observational studies are well-suited to complement, strengthen, and generate advances in precision medicine, patient-centred healthcare, and personalised health. Since 2015, the number of N-of-1 articles has doubled annually.

Similarly, digital health is an exploding field, with over 1,000 studies registered on clinicaltrials.gov. Digital health, and digital therapeutics in particular, complement N-of-1 RCTs and observational studies by providing relevant individualized health data from, for example, worn sensors, implants, regular lab assays, or -omics sequencing. Such data can be compared to population-health databases to target a patient’s strongest possible treatment option (as in cancer-risk studies) and, in turn, inform the design of an N-of-1 RCT to evaluate it. Digital health data can also be continuously monitored during the study itself and used to help tailor a treatment to the needs and preferences of patients in real time.

This Research Topic will cover digital health applications, delivery, and analysis of N-of-1 RCTs and observational studies (including self-studies) in any health discipline. The focus is on:

mobile health (mHealth) and applications (apps)
wearable devices, sensors and implants,
real-time tracking, data analytics and online registries,
patient experience of digital health and mobile health, patients as collaborators in personalised medicine, self-tracking in citizen science, etc.

The articles can be original research, methodology pieces, opinion pieces, reviews, systematic reviews, protocols, short reports, or case studies.

Full Issue Link

Jan 2020

Effects of sleep deprivation on blood glucose, food cravings, and affect in a non-diabetic: An n-of-1 randomized pilot study

EJ Daza, K Wac, M Oppezzo. Healthcare.

Sleep deprivation is a prevalent and rising health concern, one with known effects on blood glucose (BG) levels, mood, and calorie consumption. However, the mechanisms by which sleep deprivation affects calorie consumption (e.g., measured via self-reported types of craved food) are unclear, and may be highly idiographic (i.e., individual specific). Single-case or “n-of-1” randomized trials (N1RT) are useful in exploring such effects by exposing each subject to both sleep deprivation and baseline conditions, thereby characterizing effects specific to that individual. We had two objectives: (1) To test and generate individual-specific N1RT hypotheses of the effects of sleep deprivation on next-day BG level, mood, and food cravings in two non-diabetic individuals; (2) To refine and guide a future n-of-1 study design for testing and generating such idiographic hypotheses for personalized management of sleep behavior in particular, and for chronic health conditions more broadly. We initially did not find evidence for an idiographic effect of sleep deprivation, but better-refined post hoc findings indicate that sleep deprivation may have increased BG fluctuations, cravings, and negative emotions. We also introduce an application of mixed-effects models and pancit plots to assess idiographic effects over time.

Full Text Link

Jan 2019 (In Preparation)

Person as population: A longitudinal view of single-subject causal inference for analyzing self-tracked health data (Pre-Print)

EJ Daza. arXiv.

Single-subject health data are becoming increasingly available thanks to advances in self-tracking technology (e.g., mobile devices, apps, sensors, implants). Many users and health caregivers seek to use such observational time series data to recommend changing health practices in order to achieve desired health outcomes. However, there are few available causal inference approaches that are flexible enough to analyze such idiographic data. We develop a recently introduced framework, and implement a flexible random-forests g-formula approach to estimating a recurring individualized effect called the "average period treatment effect". In the process, we argue that our approach essentially resembles that of a longitudinal study by partitioning a single time series into periods taking on binary treatment levels. We analyze six years of the author's own self-tracked physical activity and weight data to demonstrate our approach, and compare the results of our analysis to one that does not properly account for confounding.

Full Text Link

Feb 2018

Causal analysis of self-tracked time series data using a counterfactual framework for n-of-1 trials

EJ Daza. Methods of Information in Medicine.

I'm very proud of this piece. It's clunky, lumbering, and overwrought. Still, I hope I did the source material justice in my first true (impostor-syndromic) attempt at telling an honest story of a single person's health habits through the language of doubt.

Conclusions. Causal analysis of an individual's time series data can be facilitated by an n-of-1 randomized trial counterfactual framework. However, for inference to be valid, the veracity of certain key assumptions must be assessed critically, and the hypothesized causal models must be interpretable and meaningful.

Full Text Document (LaTeX version; click title above for official version)

Sep 2017

Thyroid cancer mortality is higher in Filipinos in the United States: An analysis using National Mortality Records from 2003 through 2012

ML Nguyen, J Hu, K Hastings, E Daza, M Cullen, L Orloff, L Palaniappan. Cancer.

Conclusions. Negative prognostic factors for thyroid cancer traditionally include age >45 years and male sex. The results of the current study demonstrate that Filipinos die of thyroid cancer at higher rates than NFA and NHW individuals of similar ages. Highly educated Filipinos and Filipino women may be especially at risk of poor thyroid cancer outcomes. Filipino ethnicity should be factored into clinical decision making in the management of patients with thyroid cancer.

Full Text Link

Jul 2017

Estimating inverse-probability weights for longitudinal data with dropout or truncation: The xtrccipw command

EJ Daza, MG Hudgens, AH Herring. The Stata Journal.

Individuals may drop out of a longitudinal study, rendering their outcomes unobserved but still well defined. However, they may also undergo truncation (for example, death), beyond which their outcomes are no longer meaningful. Kurland and Heagerty (2005, Biostatistics 6: 241–258) developed a method to conduct regression conditioning on nontruncation, that is, regression conditioning on continuation (RCC), for longitudinal outcomes that are monotonically missing at random (for example, because of dropout). This method first estimates the probability of dropout among continuing individuals to construct inverse-probability weights (IPWs), then fits generalized estimating equations (GEE) with these IPWs. In this article, we present the xtrccipw command, which can both estimate the IPWs required by RCC and then use these IPWs in a GEE estimator by calling the glm command from within xtrccipw. In the absence of truncation, the xtrccipw command can also be used to run a weighted GEE analysis. We demonstrate the xtrccipw command by analyzing an example dataset and the original Kurland and Heagerty (2005) data. We also use xtrccipw to illustrate some empirical properties of RCC through a simulation study.
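The weighting step can be pictured with a heavily simplified Python sketch. It is not the xtrccipw implementation (a Stata command) and it skips truncation and the GEE fit. It assumes monotone dropout and a single categorical baseline covariate, and the function `dropout_ipw` and the toy data are hypothetical.

```python
from collections import defaultdict

def dropout_ipw(last_visit, group, n_visits):
    """Return {(subject, visit): weight} for observed visits under monotone dropout.

    last_visit[s] is the last visit (1-based) at which subject s was observed;
    group[s] is the baseline covariate used to model continuation.
    """
    # Continuation probability: P(observed at k | observed at k - 1, group).
    p_stay = {}
    for k in range(2, n_visits + 1):
        at_risk = defaultdict(int)
        stayed = defaultdict(int)
        for s, last in last_visit.items():
            if last >= k - 1:
                at_risk[group[s]] += 1
                if last >= k:
                    stayed[group[s]] += 1
        for g in at_risk:
            p_stay[(k, g)] = stayed[g] / at_risk[g]

    # Weight = inverse of the cumulative probability of still being observed.
    weights = {}
    for s, last in last_visit.items():
        cumulative = 1.0
        for k in range(1, last + 1):
            if k > 1:
                cumulative *= p_stay[(k, group[s])]
            weights[(s, k)] = 1.0 / cumulative
    return weights

# Toy data: six subjects, three scheduled visits, one binary covariate.
last_visit = {"a": 3, "b": 3, "c": 2, "d": 1, "e": 3, "f": 1}
group = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "f": 1}
weights = dropout_ipw(last_visit, group, n_visits=3)
print(weights[("a", 3)], weights[("e", 2)])
```

Observations from subjects who resemble those who dropped out receive larger weights, so the remaining sample stands in for the full continuing cohort in the weighted analysis.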

Full Text Link (PMC/HHS version; click title above for official version)

Jan 2017

A Bayesian approach to the g-formula

AP Keil, EJ Daza, SM Engel, JP Buckley, JK Edwards. Statistical Methods in Medical Research.

Epidemiologists often wish to estimate quantities that are easy to communicate and correspond to the results of realistic public health interventions. Methods from causal inference can answer these questions. We adopt the language of potential outcomes under Rubin’s original Bayesian framework and show that the parametric g-formula is easily amenable to a Bayesian approach. We show that the frequentist properties of the Bayesian g-formula suggest it improves the accuracy of estimates of causal effects in small samples or when data are sparse. We demonstrate an approach to estimate the effect of environmental tobacco smoke on body mass index among children aged 4–9 years who were enrolled in a longitudinal birth cohort in New York, USA. We provide an algorithm and supply SAS and Stan code that can be adopted to implement this computational approach more generally.

Full Text Link

Dec 2016

Online patient-provider e-cigarette consultations: Perceptions of safety and harm

CG Brown-Johnson, A Burbank, EJ Daza, A Wassmann, A Chieng, GW Rutledge, JJ Prochaska. American Journal of Preventive Medicine.

Conclusions. Examination of online patient–provider communications provides insight into consumer health experience with emerging alternative tobacco products. Patient concerns largely related to harms and safety, and patients preferred provider responses positively inclined toward e-cigarettes. Lacking conclusive evidence of e-cigarette safety or efficacy, healthcare providers encouraged smoking cessation and recommended first-line cessation treatment approaches.

Full Text Link

May 2016

Likelihood of unemployed smokers vs nonsmokers attaining reemployment in a one-year observational study

JJ Prochaska, AK Michalek, C Brown-Johnson, EJ Daza, M Baiocchi, N Anzai, A Rogers, M Grigg, A Chieng. JAMA Internal Medicine.

Conclusions and Relevance. To our knowledge, this is the first study to prospectively track reemployment success by smoking status. Smokers had a lower likelihood of reemployment at 1 year and were paid significantly less than nonsmokers when reemployed. Treatment of tobacco use in unemployment service settings is worth testing for increasing reemployment success and financial well-being.

Full Text Link
