{"id":117,"date":"2026-02-09T19:21:12","date_gmt":"2026-02-10T00:21:12","guid":{"rendered":"https:\/\/health.uconn.edu\/causality\/?page_id=117"},"modified":"2026-02-16T16:22:34","modified_gmt":"2026-02-16T21:22:34","slug":"spatialcausal","status":"publish","type":"page","link":"https:\/\/health.uconn.edu\/causality\/spatialcausal\/","title":{"rendered":"Causality in disparities spatial analytics"},"content":{"rendered":"<p><strong><em>Causality in disparities spatial analytics <\/em><\/strong><\/p>\n<p><em>The conundrum and common solutions\u00a0<\/em><strong>Na\u00efve<\/strong>\/aspatial <strong>methods<\/strong> analyzing spatial data <strong>overestimate effects<\/strong> because observations inherently have <em>spatial autocorrelation<\/em>; for instance, life expectancy shows a classic (na\u00efve) Pearson correlation even with the statistically meaningless FIPS census tract identifier (<a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/34910138\/\">Pubmed<\/a>),\u00a0 yet a proper spatial lag regression correctly yields no such relation.<\/p>\n<p>There is a <strong>long tradition<\/strong> of analyzing and interpreting spatial data, in fields ranging from agriculture and economics to astrophysics (Stigler, 1999, p. 193, <a href=\"https:\/\/www.jstor.org\/stable\/j.ctv1pdrpsj\" class=\"broken_link\">Jstor<\/a>). Yule (1899, <a href=\"https:\/\/www.jstor.org\/stable\/2979889\" class=\"broken_link\">Jstor<\/a>), for example, reported on factors responsible for changes in poverty (\u2018pauperism\u2019), using census data from England and plain (na\u00efve, i.e. not spatial) regressions (his study is showcased in Freedman, 2005, <a href=\"https:\/\/onlinelibrary.wiley.com\/doi\/abs\/10.1002\/0470013192.bsa598\">Wiley<\/a>). <a href=\"#_edn1\" name=\"_ednref1\"><span>[i]<\/span><\/a><\/p>\n<p><strong>Any data<\/strong> collected <strong>on Earth<\/strong> is <strong>spatial<\/strong>: it comes from a specific location on Earth. 
<a href=\"#_edn2\" name=\"_ednref2\"><span>[ii]<\/span><\/a><\/p>\n<p><strong>Few studies<\/strong> <strong>test<\/strong> the extent of <strong>spatial nonindependence<\/strong> and then apply proper spatial analytical tools; most rely on na\u00efve or a-spatial analyses. A few, like Kramer et al. (Kramer, Black, Matthews, &amp; James, 2017 <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/29226214\/\">Pubmed<\/a>), reported their main variables\u2019 Moran\u2019s I values upfront, and then analyzed the spatial data using spatial analytic models, which directly account for the spatial \u2018contagion\u2019 effect.<\/p>\n<p><em>Modeling spatial non-independence with modern tools <\/em>The independence of data points is a common assumption in most analytical models. When <strong>data<\/strong> is \u2018<strong>relational<\/strong>\u2019, i.e. data points (persons, regions) are related in some manner, their contribution to the analysis literally diminishes. At one extreme, if for example two spouses respond completely identically to the question \u2018How many children do you have?\u2019, their responses become one data point, rather than two. Analyses ignoring this data \u2018coupling\u2019 would yield a na\u00efve\/non-dyadic biased view (Kenny et al., 2010, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/20005618\/\">Pubmed<\/a>).\u00a0 The same logic applies to over-time data, like height measured after reaching adulthood: height doesn\u2019t need to be measured twice when similarity across time is expected. I delve below into the intricacies of how and why spatial structuring artificially magnifies the na\u00efve associations, <a href=\"#_edn3\" name=\"_ednref3\"><span>[iii]<\/span><\/a> but I provide a quick preview of the solution: any spatial outcome analyzed carries a co-occurring spatial \u2018autocorrelation\u2019 effect; unless that effect is modeled, the effect of any predictor will appear artificially inflated. 
Such a data structure is very similar to dyadic and \u2018related samples\u2019 repeated measurements. <a href=\"#_edn4\" name=\"_ednref4\"><span>[iv]<\/span><\/a><\/p>\n<p>The non-independence among data points (here areas or regions) has been termed \u2018<strong>contagion<\/strong>\u2019 between cases (or \u2018interference\u2019, <a href=\"https:\/\/www.jstor.org\/stable\/43288499\" class=\"broken_link\">Jstor<\/a>, spatial \u2018interaction\u2019, <a href=\"https:\/\/www.jstor.org\/stable\/2984812\" class=\"broken_link\">Jstor<\/a>, or \u2018confounding due to location\u2019, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/8144305\/\">Pubmed<\/a>). The manner in which spatial \u2018auto\u2019-correlation <a href=\"#_edn5\" name=\"_ednref5\"><span>[v]<\/span><\/a> affects statistical estimates is rarely conveyed intuitively. When non-independence of individual cases is operating, the expected value is <em>not<\/em> the arithmetic mean anymore, but becomes a weighted mean.<a href=\"#_edn6\" name=\"_ednref6\"><span>[vi]<\/span><\/a> From a causal inference standpoint, this contagion effect implies that a certain region targeted by a policy-driven intervention has the potential of affecting (and being affected by!) its neighboring regions, above and beyond the intervention. Although statistical independence and lack of a causal relation are distinct concepts, when data nonindependence occurs, one must correct for it before investigating causal relations. 
Inferring causal relations from observed (plain) correlations between variables, and pursuing their sources, can be achieved\u00a0 using the classic method of path analysis (Wright, 1921, <a href=\"https:\/\/www.scirp.org\/reference\/referencespapers?referenceid=1706123\">Scirp<\/a>; see also <a href=\"https:\/\/www.researchgate.net\/profile\/Lee-Wolfle\/publication\/254312746_Sewall_Wright_on_the_method_of_path_coefficients_An_annotated_bibliography\/links\/55b77d3e08ae9289a08be8a2\/Sewall-Wright-on-the-method-of-path-coefficients-An-annotated-bibliography.pdf\" class=\"broken_link\">Annotated bibliography<\/a>).<\/p>\n<p>Most <strong>reports<\/strong> in medical and health literature are based on analyses that<strong> effectively <em>drop<\/em> the spatial structure<\/strong> and only retain the simple table-like component of the spatial data (what Geographic Information Systems, GIS, analysts call the \u2018attribute table\u2019). <a href=\"#_edn7\" name=\"_ednref7\">[vii]<\/a> I provide in an online document at <a href=\"http:\/\/tinyurl.com\/SPATIALPRPR\">tinyurl.com\/SPATIALPRPR<\/a> a list of analyses (in Stata) that advance from simple (na\u00efve) regressions to multilevel spatial structural models; I also built a more intuitive deconstruction, using plain Excel, and showed how to add the matrix of neighboring relations to the table data, and manually run spatial lag regressions: <a href=\"https:\/\/tinyurl.com\/blogstats1\">Tinyurl.com\/BLOGSTATS1<\/a> .<\/p>\n<p>Unfortunately, the classical <strong>explanations of nonindependence<\/strong> focus on groups of cases, in the ANOVA tradition, or on 1:1 (dyadic) designs, and do not transfer well to the spatial \u2018mutual dependence\u2019 context. 
Whereas groups like schools are generally mutually exclusive, that is, each student (a case) belongs generally to only one school, with spatial data the \u2018group\u2019 (or cluster) that each case \u2018belongs to\u2019 is made up of <em>several<\/em> other cases in the same dataset, and these \u2018groups\u2019 overlap many times over, similar to individuals belonging to several friendship groups. This means conversely that each region acts as a \u2018group member\u2019 in as many such \u2018groups\u2019 as the number of its spatial neighbors, because each region defines its own group, by rounding up its neighbors. Using the example of the US state of Connecticut (CT), MA is in the group of CT\u2019s neighbors, but CT also counts as MA\u2019s neighbor. CT has three neighbors: NY, MA, and RI, so CT\u2019s life expectancy (LifeExp<sub>CT<\/sub>) is influenced by all the other 3 neighboring states\u2019 life expectancies, due to a host of social processes (including e.g. residents moving between the states), so we can say for instance that LifeExp<sub>(NY &amp; MA &amp; RI)<\/sub>\u00a0 -&gt; LifeExp<sub>CT<\/sub>, but because CT is one of the 5 neighbors of MA (along with NH, VT, NY, and RI), we also have LifeExp<sub>(CT &amp; NH &amp; VT &amp; NY &amp; RI)<\/sub>\u00a0 -&gt; LifeExp<sub>MA<\/sub>, which makes visible the fact that there are feedback-loop relations arising between neighboring states, due to their spatial adjacency (\u2018self-reinforcing\u2019 effects). Figure 1 shows how such influences flow, by portraying only CT and MA: only same \u2018self\u2019 <em>cross<\/em>-variable effects are considered in na\u00efve analyses (the common association setup: CT has higher than mean value on a variable, and also higher than mean on another, etc.), while spatial analyses also model the same variable effects: how CT and MA values of the same variable relate, and why.<\/p>\n<p>Fig. 1. 
Sources of na\u00efve (N) and spatial (S) effects; two US neighboring states <a href=\"https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.fig1_.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.fig1_-300x97.png\" alt=\"\" width=\"532\" height=\"172\" class=\"wp-image-122 aligncenter\" srcset=\"https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.fig1_-300x97.png 300w, https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.fig1_.png 534w\" sizes=\"(max-width: 532px) 100vw, 532px\" \/><\/a><\/p>\n<p><em>Notes<\/em>: Two regions with <em>Self &amp; Other region <\/em>links and with <u>Same &amp; Cross variable <\/u>links: sources of plain\/na\u00efve\/a-spatial (N:) correlations &#8211; na\u00efve &#8211; continuous curved arrows; interrupted arrows: links due to spatial structure (S:); other-region &amp; same-variable vertical arrows: the basis for \u2018auto\u2019-correlations (S:<em>Other.<\/em><u>XSame<\/u>); diagonal arrows (S:<em>Other.<\/em><u>Cross<\/u>) are other-region &amp; cross-variable influences (generally not modeled): CT = Connecticut, MA = Massachusetts; LifeExp: Life Expectancy.<\/p>\n<p>The inherent spatial structure results in \u2018<strong>excess similarity<\/strong>\u2019 among neighboring regions beyond mere randomness, and explains why the expected level of spatial independence (e.g., Moran\u2019s I) is not zero, but E(I) = (-1)\u2044(n-1), (see proof in Griffith\u2019s Appendix, <a href=\"https:\/\/search.worldcat.org\/title\/15489422\">Worldcat<\/a>) . 
<a href=\"#_edn8\" name=\"_ednref8\"><span>[viii]<\/span><\/a> This peculiarity of spatial data is rarely fully explained: the null hypothesis \u2018position\u2019 is shifted away from zero, often \u2018upwards\u2019, and hence the statistical test, in this binary variables case the chi-squared test, is more likely to conclude the presence of a <em>na\u00efve<\/em> relationship, when no true <em>spatial<\/em> relationship is there.<\/p>\n<p>Spatial data therefore inherently contain at least <em>three types of \u2018<strong>clusterings<\/strong>\u2019<\/em>, and all of them may shift the na\u00efve estimates of effects, in rather complex ways.<\/p>\n<p><strong>(i). <\/strong>Individual residents are \u2018clustered\u2019, e.g. in census tracts (Gelman, Shor, Bafumi, &amp; Park, 2008, <a href=\"https:\/\/drive.google.com\/file\/d\/1uWSiEzy42m7XHS7xGi_Sh8WQLbs0-jK7\/view?usp=sharing\">GDrive<\/a>); this is the classic \u2018students clustered in schools\u2019 setup from multilevel designs;<\/p>\n<p><strong>(ii).<\/strong> Census tracts themselves are \u2018nested\u2019 in counties. Note that with areal data, <em>all<\/em> lower-level regions (census tracts) are included in the data, i.e. there is no sampling of them: having data on the whole population of regions creates another unique statistical challenge, because probabilities associated with classic estimates are not equally meaningful. The spatial structuring of the data adds a third layer of \u2018clustering\u2019, based on<\/p>\n<p><strong>(iii).<\/strong> The neighboring or adjacency natural spatial relation (\u2018adjacency spatial structure\u2019; Tiefelsdorf, 2000, p. 39, <a href=\"https:\/\/search.worldcat.org\/title\/42476773\" class=\"broken_link\">Worldcat<\/a>): one can literally consider every region an \u2018ego\u2019, with its neighbors as \u2018alters\u2019, using common sociometric language (Coleman, Katz, &amp; Herbert, 1966, p. 
70, <a href=\"https:\/\/search.worldcat.org\/title\/565386\">Worldcat<\/a>). <a href=\"#_edn9\" name=\"_ednref9\"><span>[ix]<\/span><\/a> I note that the spatial dimension can be utilized as an ordering criterion either in a sociometric spatial manner, i.e. a region connected to its natural adjacent neighboring regions, or continuously, by marking a region\u2019s location as latitude and longitude (a third dimension would rarely be needed analytically); geographically weighted regression is a common solution in this continuous space context (Brunsdon, Fotheringham, &amp; Charlton, 1996, <a href=\"https:\/\/www.jstor.org\/stable\/2988625\" class=\"broken_link\">Jstor<\/a>), vs. the polygon boundary related areas I consider here.<\/p>\n<p>With these clustering structures present, spatial areal data at more than one regional level require statistical methods that can model both the multilevel and the spatial non-independence structures; multilevel spatial path analysis (or SEM, if latent variables are modeled) is such a natural option: no <em>spatial<\/em> multilevel models have been published, to our knowledge.<\/p>\n<p>Properly<strong> accounting for<\/strong> the \u2018<strong>mutual influence<\/strong>\u2019 induced by spatial structure requires the specification and estimation of <em>spatial<\/em> effects. The spatial econometrics literature has provided two handy tools, the spatial lag and spatial error regressions (Anselin, 1988, <a href=\"https:\/\/link.springer.com\/book\/10.1007\/978-94-015-7799-1\">Springer<\/a>, p. 22), wherein one adds an additional element in regression models that accounts for the spatial structure, either as a co-predictor, or contained in the residual errors. The spatial lag method is a simple approach which combines the neighbors of each case (e.g. 
county) in the data into a global \u2018other\u2019, by creating an average score of all the neighbors\u2019 values for each outcome; I posted an Excel setup for doing so by hand for a dataset of the 49 contiguous US states at <a href=\"http:\/\/tinyurl.com\/SPATIALPRPR\">tinyurl.com\/SPATIALPRPR<\/a> (explained at <a href=\"https:\/\/tinyurl.com\/BLOGSTATS3\">Tinyurl.com\/BLOGSTATS3<\/a>): the spatial lag is simply a multiplication of two matrices: a column vector of the original variable, and the weights matrix of \u2018who is whose neighbor\u2019 (Anselin, 1988, <a href=\"https:\/\/link.springer.com\/book\/10.1007\/978-94-015-7799-1\">Springer<\/a>, p. 23).<\/p>\n<p>The spatial lag correction opens up a better <strong>modeling intuition<\/strong> for the extent of the bias induced by na\u00efve\/a-spatial analyses, coming from a parallel graphical method. Sewall Wright\u2019s path analytic invention, now more than 100 years old, decomposed associations into causal and non-causal components; it has been expanded to provide a formal causal calculus (Pearl, 2017, <a href=\"ftp:\/\/ftp.cs.ucla.edu\/stat_ser\/r459.pdf\">Ucla.Edu<\/a>). One simple method to account for the spatial nonindependence is to include the outcome&#8217;s (first) &#8216;spatial lag\u2019 as a second predictor (Rey &amp; Boarnet, 1999, <a href=\"https:\/\/link.springer.com\/chapter\/10.1007\/978-3-662-05617-2_5\">Springer<\/a>). This \u2018<strong>self<\/strong>\u2019 (i.e. same variable) spatial lag captures the influence of neighboring regions on each region, often attenuating the primary (na\u00efve) effect of interest. 
Classic proofs going back to 1946 (Cramer, 1946, <a href=\"https:\/\/search.worldcat.org\/title\/528245\">Worldcat<\/a>), cited in (Pearl, 2017, <a href=\"ftp:\/\/ftp.cs.ucla.edu\/stat_ser\/r459.pdf\">Ucla.Edu<\/a>), show by how much the estimate of the effect is biased if one ignores a second predictor that correlates with the first predictor (and the corrected effect can even change signs: &gt;0 vs. &lt; 0). The extent of the change (or bias) can be directly gauged visually, \u2018walking through\u2019 the graphical path model: the (na\u00efve) correlation, in the two predictor model, is now the result of two pathways: a direct effect, and an indirect connection through the second predictor: the \u2018tracing rule\u2019 (proved incidentally by Kiiveri et al., 1984, <a href=\"https:\/\/www.cambridge.org\/core\/journals\/journal-of-the-australian-mathematical-society\/article\/recursive-causal-models\/5AE56E9EA98DCC06E951DD8F5A57E033\">CambridgeJournals<\/a>, and Moran, 1961, <a href=\"https:\/\/arc.aiaa.org\/doi\/pdf\/10.2514\/8.9194\">Aiaa.Org<\/a>) yields this result right away. <a href=\"#_edn10\" name=\"_ednref10\"><span>[x]<\/span><\/a><\/p>\n<p>I illustrate with the CDC\u2019s Life Expectancy<a href=\"#_edn11\" name=\"_ednref11\"><span>[xi]<\/span><\/a> online data, with census tract, county and state level Life Expectancy at Birth 2010-2015 layers, and variables from the CDC Social Vulnerability Index online data, available for census tracts and counties (Centers for Disease Control, 2024), which contains income, percent minority, percent poverty, and other fields. The data and Stata and Mplus syntax are posted online (E. 
Coman, 2024, <a href=\"https:\/\/doi.org\/10.7910\/DVN\/OWJGXR\">Dataverse.Harvard<\/a>) ; direct link <a href=\"https:\/\/tinyurl.com\/SPATIALPRPR\">Tinyurl.com\/SPATIALPRPR<\/a> ).<\/p>\n<p>When <strong>Moran I\u2019s are sizeable<\/strong> (.4 or larger) and statistically significantly different from zero, according to pseudo-p values, which are random permutations of the observed values over the locations (Anselin, 10\/12\/2020) <a href=\"https:\/\/geodacenter.github.io\/workbook\/5a_global_auto\/lab5a.html\">Geodacenter<\/a>, this indicates the need to correct for the spatial \u2018contagion\u2019 effects. The means of each variable shift, as expected, mainly because of how each higher-level region aggregates lower-level ones, in terms of different populations of residents. The ranges also differ across census tracts, counties or states, with larger regions showing smaller ranges of values. Notably, these means do not also represent the US population as a whole, but the regions they summarize.<\/p>\n<p><strong>Table 1<\/strong> presents the <strong>na\u00efve associations<\/strong> between the three main spatial variables, %Minority and Life Expectancy, across census tracts and counties, respectively (state-level relations were less meaningful), along with <strong>spatial associations<\/strong> derived from spatial lag path models shown in Figure 3: the correlation between the spatial variables adjusted for their auto-correlations. 
I note that there is little agreement on how to conceptualize a non-directional spatial coefficient of association between two spatial variables (but see Lee, 2001 <a href=\"https:\/\/link.springer.com\/article\/10.1007\/s101090100064\">Springer<\/a> , or Tj\u00f8stheim, 1978 <a href=\"https:\/\/academic.oup.com\/biomet\/article-abstract\/65\/1\/109\/247096\" class=\"broken_link\">Oup<\/a> , implemented through the cor.spatial function in the R package SpatialPack, (Osorio &amp; Vallejos, 2019, <a href=\"https:\/\/cran.r-project.org\/web\/packages\/SpatialPack\/index.html\">R-project<\/a>)<\/p>\n<p><strong>Table 1. <\/strong>Zero-order na\u00efve\/a-spatial Pearson correlations and standardized spatial associations between %Minority and Life Expectancy; <em>census tract <\/em>and<em> county <\/em>levels<\/p>\n<p><a href=\"https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.table1_.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.table1_-300x33.png\" alt=\"\" width=\"547\" height=\"60\" class=\" wp-image-121 aligncenter\" srcset=\"https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.table1_-300x33.png 300w, https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.table1_-1024x114.png 1024w, https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.table1_-768x85.png 768w, https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.table1_-1536x171.png 1536w, https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.table1_-2048x228.png 2048w\" sizes=\"(max-width: 547px) 100vw, 547px\" \/><\/a><\/p>\n<p>Notes: N<sub>CsTr<\/sub> = 60,609 census tracts and N<sub>Cnty<\/sub> = 3,055 counties; spatial lag correlation coefficients are residual correlations between the residual errors of each variable, after its own lag is 
entered as its predictor. The models are simply: SpatialLag (Y<sub>1<\/sub>) -&gt; Y<sub>1<\/sub> &lt;- Residual (Y<sub>1<\/sub>)<sub>\u00a0\u00a0 <\/sub>&lt;-SpatialAssociation (Y<sub>1<\/sub>Y<sub>2<\/sub>)-&gt;\u00a0 SpatialLag (Y<sub>2<\/sub>) -&gt; Y<sub>2<\/sub> &lt;- Residual (Y<sub>2<\/sub>) (detailed in Figure 3)<\/p>\n<p><strong>Pearson correlation<\/strong> estimates are <strong>inflated<\/strong> compared to their spatial counterparts.<\/p>\n<p>While such <strong>discrepancies<\/strong> appear small, they are more <strong>consequential<\/strong> when spatial data is used to investigate possible measurement structures of constructs, like social vulnerability (Karaye &amp; Horney, 2020, <a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S0749379720302592\" class=\"broken_link\">ScienceDirect<\/a>) or structural racism (Lukachko, Hatzenbuehler, &amp; Keyes, 2014, <a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S0277953613004206\" class=\"broken_link\">ScienceDirect<\/a>). There are also shifts (more evident at state scale, however) in statistical significance levels of estimates of effects, not seen at census tract or county level, where analyses use large samples, over 60,000 census tracts and over 3,000 counties, which again cover the <em>entire population<\/em> of such regions. On the other hand, the census tract and county estimates from their respective spatial lag models are refreshingly similar (or stable across geographic scale; Marston, 2000, <a href=\"https:\/\/www.researchgate.net\/publication\/240335845_The_Social_Construction_of_Scale\">ResearchGate<\/a>).<\/p>\n<p><strong>Fig. 
2.<\/strong> Spatial lag path model to derive a spatial association<\/p>\n<p><a href=\"https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.fig2_.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.fig2_-300x89.png\" alt=\"\" width=\"455\" height=\"135\" class=\" wp-image-120 aligncenter\" srcset=\"https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.fig2_-300x89.png 300w, https:\/\/health.uconn.edu\/causality\/wp-content\/uploads\/sites\/264\/2026\/02\/6.fig2_.png 466w\" sizes=\"(max-width: 455px) 100vw, 455px\" \/><\/a><\/p>\n<p><em>Notes<\/em>: Dotted lines are paths set to unity (=1); double lines are the \u2018self\u2019 spatial lag effects, same variable.<\/p>\n<p>The <strong>relationships<\/strong> between the same pair of variables <strong>differed by level<\/strong>, e.g. %Non-White &lt;-&gt; Life expectancy, county vs. census tract level. Plus, each can directly cause each other, even if one direction might be predominant, i.e. a variable is a \u201cgreater cause of the other\u201d(Kenny &amp; Harackiewicz, 1979), p. 377. <a href=\"https:\/\/www.academia.edu\/115034803\/Cross_lagged_panel_correlation_Practice_and_promise\">Academia.Edu<\/a>) . Because the data literally doubles by generating spatial lag derivations, path models allowing for both direct effects (in both directions) can be easily estimated by conveniently utilizing the spatial lag of each variable as a \u2018natural\u2019 instrumental variables (IV, (Maydeu-Olivares, Shi, &amp; Rosseel, 2019. 
<a href=\"https:\/\/biblio.ugent.be\/publication\/8642364\">UGent<\/a>) , we include in the online Appendix <a href=\"http:\/\/tinyurl.com\/SPATIALPRPR\">tinyurl.com\/SPATIALPRPR<\/a> illustrations of such cyclical models).<\/p>\n<p><strong><em>Conclusions <\/em><\/strong>The spatial lag expansion of the regular regression\/path analysis-based modeling I showed opens up exciting new analytical prospects. The combination of spatial analytics and the classical multilevel models invites better clarifications of what \u2018clustering\u2019 and non-independence mean, and how they happen. One can for example investigate census-tract level effects of percent minority on life expectancy, while taking into account the \u2018nesting\u2019 of census tracts within the \u2018higher-level\u2019 counties. This option brings about the second type of data nonindependence, which is a combination of the classic spatial \u2018auto\u2019-correlation, and the mere \u2018belonging to a group\u2019 type of multilevel clustering: census tracts belong to the same county by virtue of spatially \u2018sitting\u2019 in a larger region. One can moreover examine separately the extent of this 2-level clustering by assessing the intra-class correlation (ICC; Haggard, 1958, <a href=\"https:\/\/search.worldcat.org\/title\/242480\">Worldcat<\/a>), which is a measure of nonindependence distinct from Moran\u2019s I. ICCs can be generated also from proper spatial (linear mixed, or SEM multilevel) models.<\/p>\n<p>Tests of the lower-level census tract effects of percent minority on life expectancy can be modeled also as higher level (county, e.g.) random coefficients, at which level one can also examine whether county-level predictors, like poverty, predict variability in this effect itself, across counties: all this while properly accounting for the spatial structure inherent in the regional\/geographic data. 
One can also directly test, for example, <strong>structural and systemic effects<\/strong>, like higher-level (e.g. state or county) factors affecting lower level (census tract) indicators or effects, net of lower-level effects. An example would be the county-level political leaning of legislating bodies impacting both the average level of life expectancy and the size of the \u2018percent minority -&gt; life expectancy\u2019 effect manifested across lower-level census tracts. Such models expand the scope of health inequality inquiries to consider also macroeconomic policy implications (Mitchell, 2001, <a href=\"https:\/\/eprints.gla.ac.uk\/22158\/\">Gla.ac.uk<\/a>). Modeling both spatial and temporal \u2018auto\u2019-correlations is another exciting extension. A dyadic view of the data can also be taken, recognizing that all spatial variables are quite like \u2018self\u2019 variables, while their spatial lags appear in the data as a second set of \u2018the other\u2019 variables (Wickham, 2023, <a href=\"https:\/\/www.tandfonline.com\/doi\/abs\/10.1080\/10705511.2022.2159410\">SEMj<\/a>).<\/p>\n<p>A note of <strong>caution<\/strong> is needed: statements derived from analyses of regional (areal) data alone (without individual-level resident data) require a higher burden of proof to be stated as cause-effect findings. An analysis of differences in racial\/minority composition and say mortality due to cardiovascular diseases across US counties could not be stretched to conclude that one\u2019s race\/ethnicity <strong>causes<\/strong> mortality per se, but only that such <em>regional<\/em> differences in mortality rates do exist, and go hand-in-hand with regional racial\/ethnic composition differences.<\/p>\n<p><em>Further extensions\u00a0<\/em>Note that diversity itself differs conceptually from segregation, and some have found some \u2018benefits\u2019 of segregation, i.e. 
a longer expected longevity for residents in segregated areas (Chetty et al., 2016, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/27063997\/\">Pubmed<\/a>). Similarly, income and income inequality (e.g. the Gini income inequality coefficient (Ceriani &amp; Verme, 2012, <a href=\"https:\/\/link.springer.com\/article\/10.1007\/s10888-011-9188-x\">Springer<\/a>), or relative income (Wilkinson, 1992, p. 168, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/1637372\/\">Pubmed<\/a>)) have distinct effects on health (Kawachi &amp; Kennedy, 1999, <a href=\"https:\/\/pmc.ncbi.nlm.nih.gov\/articles\/PMC1088996\/\">Pubmed<\/a>), and further investigation could examine their combined effect on life expectancy. Moreover, while I examined the role of income, I have not modeled income inequality per se separately as another potential factor. \u2018Income inequality\u2019 might emerge as a predictor of differences in health outcomes and life expectancies, so well-aimed public policies might go beyond mere increases in minimum wages, and can involve tax and redistribution policies (Avendano &amp; Kawachi, 2014, <a href=\"https:\/\/pmc.ncbi.nlm.nih.gov\/articles\/PMC4112220\/\">Pubmed<\/a>). Correspondingly, disparities or inequities in poverty can be derived and examined as determinants, like the difference in the rates of poverty among White residents vs. those among non-White residents.<\/p>\n<p>The <strong>interplay between income and poverty<\/strong> in affecting health outcomes, and to what extent they overlap, is actively researched (Khullar &amp; Chokshi, October 4, 2018, <a href=\"https:\/\/www.healthaffairs.org\/content\/briefs\/health-income-poverty-we-could-help\">HealthAffairs<\/a>), but needs further investigation. 
A wider causal model of inter-relations between regional indicators like income and poverty, and health outcomes, on the other hand, may position income, for example, on different causal pathways leading to better or worse health, which can include additional structural, systemic, and institutional racism potential factors, as well as what is increasingly termed \u2018political determinants of health\u2019 (Mackenbach, 2013, <a href=\"https:\/\/pure.eur.nl\/files\/47555434\/Mackenbach-JP.-Political-determinants-of-health-editorial-.-Eur-J-Public-Health.-2014-Feb-24-1-2.pdf\" class=\"broken_link\">Eur.nl<\/a>).<\/p>\n<p>While analyses like the spatial lag models can provide answers for potential region-based interventions aimed at modifying living conditions, the statistical answer to \u2018how much this census tract\u2019s (or county\u2019s) average life expectancy might increase if one improved its median income by a specific amount\u2019 would not translate into <strong>real-life changes<\/strong>, as the processes underlying regional differences are not merely occurring at the region level, but are complex social and economic processes that involve individual residents, neighborhoods, economic agents, etc. (Chetty, Hendren, Kline, &amp; Saez, 2014, <a href=\"https:\/\/www.nber.org\/papers\/w19843\">Nber.org<\/a>; Mitchell, 2001, <a href=\"https:\/\/eprints.gla.ac.uk\/22158\/\">Gla.ac.uk<\/a>; Wilkinson, 1992, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/1637372\/\">Pubmed<\/a>). \u2018People poverty\u2019 for instance is distinct from \u2018place poverty\u2019, which reflects more \u2018public necessities\u2019 than individual paucity (Martin &amp; Morrison, 2003), p. 
246) <a href=\"https:\/\/search.worldcat.org\/title\/56615111\" class=\"broken_link\">Worldcat<\/a> , and they might even be inversely related (Powell, Boyne, &amp; Ashworth, 2001, <a href=\"https:\/\/bristoluniversitypressdigital.com\/view\/journals\/pp\/29\/3\/article-p243.pdf\">BristolU<\/a>). Moreover, different factors might be responsible for differences between countries (e.g. differences in health care, individual behaviors, socioeconomic inequalities, and the built physical environment (Avendano &amp; Kawachi, 2014, <a href=\"https:\/\/pmc.ncbi.nlm.nih.gov\/articles\/PMC4112220\/\">Pubmed<\/a>)), vs. lower-level regions, since geographies are \u201ca nested hierarchy of differentially sized and bounded spaces\u201d (Marston, Jones III, &amp; Woodward, 2005, <a href=\"https:\/\/www.researchgate.net\/publication\/237405538_Human_Geography_Without_Scale\">ResearchGate<\/a>).<\/p>\n<p>The extent to which regions represent also distinct <strong>administrative boundaries<\/strong> (like counties), with local governing structures able to affect residents\u2019 health through localized policies, or mere geographic areal demarcations (like census tracts, or ZIP code tabulation areas, ZCTAs), will also have a say in how strong the \u2018effects\u2019 would be across such geographies; this relates to the \u2018modifiable areal unit problem\u2019 (MAUP; Openshaw &amp; Taylor, 1979, <a href=\"https:\/\/www.semanticscholar.org\/paper\/A-million-or-so-correlation-coefficients-%3A-three-on-Openshaw\/c403fc42f88698d144a2729b1904b95f934e4ae1\">SemanticScholar<\/a>), which can be compounded by areal misalignment, i.e. when the same smaller region \u2018spills over\u2019 into more than one larger region (Zhukov, Byers, Davidson, &amp; Kollman, 2023, <a href=\"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/0EB1F25861F9CAF940D6DB07333C8345\/S1047198723000050a.pdf\/integrating_data_across_misaligned_spatial_units.pdf\">Cambridge<\/a>). 
Regions can also be literally modified, thereby concentrating social groups within, or dividing them across, politically drawn boundaries, with effects on health (Rushovich, Nethery, White, &amp; Krieger, 2024, <a href=\"https:\/\/www.researchgate.net\/publication\/383461773_Gerrymandering_and_the_Packing_and_Cracking_of_Medical_Uninsurance_Rates_in_the_United_States?_tp=eyJjb250ZXh0Ijp7InBhZ2UiOiJzY2llbnRpZmljQ29udHJpYnV0aW9ucyIsInByZXZpb3VzUGFnZSI6bnVsbCwic3ViUGFnZSI6bnVsbH19\">ResearchGate<\/a>). School districts in the US, for instance, are rather loose regional units of this kind, yet due to local education boards\u2019 setup and administrative powers over the schools in their area, they will directly affect some health aspects of school-age residents, e.g. by accepting or rejecting free school-lunch offers from the US federal government during Covid-19 (Kashyap &amp; Jablonski, 2024, <a href=\"https:\/\/onlinelibrary.wiley.com\/doi\/full\/10.1002\/aepp.13460\">Wiley<\/a>).<\/p>\n<p>*** The last part, # 7, will briefly go over some remaining challenges and opportunities for both advancing this field, and for better explaining it, like the \u2018equivalence of potential outcomes (\u2018Rubin\u2019, more properly Cochran\u2019s\u2026 see note viii below and image insert) and causal calculus (Pearl) approaches to causality\u2019.<\/p>\n<p><span style=\"text-decoration: underline\">ENDNOTES<\/span><\/p>\n<p><a href=\"#_ednref1\" name=\"_edn1\"><span>[i]<\/span><\/a>\u00a0 \u201cThe number of paupers in one area may well be affected by relief policy in neighboring areas. Such issues are not resolved by the data analysis\u201d ((Freedman, 2005), p. 
1063, <a href=\"https:\/\/sci-hub.st\/https:\/onlinelibrary.wiley.com\/doi\/abs\/10.1002\/0470013192.bsa598\" class=\"broken_link\">2Wiley<\/a> ).\u00a0Several methods for analyzing spatial (or geo-referenced (Vallejos, Osorio, &amp; Bevilacqua, 2020, \u00a0<a href=\"https:\/\/search.worldcat.org\/title\/1198374628\">Worldcat<\/a> ) \u00a0data exist (a taxonomy is in (Anselin, 1988), p. 32); a simple spatial correction model is the Cliff and Ord\u2019s (Cliff &amp; Ord, 1973, <a href=\"https:\/\/search.worldcat.org\/title\/7835738\" class=\"broken_link\">Worldcat<\/a> ) \u00a0spatial \u2018autoregressive\u2019 model (also called the spatial Durbin error model, (Anselin, 1988)), which is informed by Whittle\u2019s two-dimensional linear autoregression (Whittle, 1954, <a href=\"https:\/\/www.jstor.org\/stable\/2332724\" class=\"broken_link\">Jstor<\/a> ), see (Kelejian &amp; Prucha, 2004) <a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S0304407603001337\" class=\"broken_link\">ScienceDirect<\/a> ). Despite the availability of such tools for correcting for spatial nonindependence of data (some free, like (Anselin, 2021, ) , where this analysis is labeled spatial lag regression), there are rare reports of proper spatial analyses of geographic\/regional data. Prompted by the increasing availability of such aggregated regional data (e.g. 
life expectancy (National Center for Health Statistics, 2020, <a href=\"https:\/\/nap.nationalacademies.org\/read\/13089\">NationalAcademies<\/a> ), a slew of research reports have found associations between purportedly causally remote constructs, like redlining (as indicator of historic structural racism) and diabetes prevalence in modern times (Egede, Walker, Campbell, &amp; Linde, 2024, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/38387079\/\">Pubmed<\/a> ).<\/p>\n<p><a href=\"#_ednref2\" name=\"_edn2\"><span>[ii]<\/span><\/a> Location information is not always collected and processed, however, and at times it is not essential to research inquiries. Moreover, health-relevant data collected from people are always to some extent non-independent: at the largest scale, all humans breathe (more or less) \u2018the same air\u2019 from the same atmosphere, so there is a \u2018clustering\u2019 due to such common living conditions. Depending on what outcome is investigated and who is recruited, or how data is assembled, however, the impact of such \u2018excess similarities\u2019 ranges from somewhat ignorable, to analytically problematic.<\/p>\n<p><a href=\"#_ednref3\" name=\"_edn3\"><span>[iii]<\/span><\/a> Classical analytical methods had to be adapted to handle data that are \u2018correlated\u2019 (e.g. the chi-squared test (Cerioli, 1997 <a href=\"https:\/\/www.jstor.org\/stable\/2533962\" class=\"broken_link\">Jstor<\/a>), and such adaptations have a long history (Haggard, 1958, <a href=\"https:\/\/search.worldcat.org\/title\/242480\">Worldcat<\/a> \u00a0p. 5); in social psychological research the topic has been extensively handled under the \u2018nonindependence\u2019 framework (see e.g. 
(Kenny, Ackerman, &amp; Kashy, 2024, <a href=\"https:\/\/www.researchgate.net\/publication\/387568422_The_Design_and_Analysis_of_Data_from_Dyads_and_Groups\">ResearchGate<\/a>; Kenny &amp; La Voie, 1985, <a href=\"https:\/\/www.researchgate.net\/publication\/232518916_Separating_Individual_and_Group_Effects\">ResearchGate<\/a>)). Comparing the similarity within groups to that between groups is the classic logic of analyzing differences in averages and sources of variability (the analysis of variance method; Kenny, 1987, p. 224, <a href=\"https:\/\/davidakenny.net\/doc\/statbook\/kenny87.pdf\">DavidAKenny.net<\/a>). The sources of \u2018excess\u2019 similarity in spatial analytics differ, however, because the natural spatial structure adds a distinct \u201ccontagion\u201d to the data.<\/p>\n<p><a href=\"#_ednref4\" name=\"_edn4\"><span>[iv]<\/span><\/a> When data is collected from a sample of patients twice, for example, applying an \u2018independent samples\u2019 t-test will show different results compared to the proper companion paired t-test (Coman et al., 2013, <a href=\"https:\/\/www.frontiersin.org\/journals\/psychology\/articles\/10.3389\/fpsyg.2013.00738\/full\">Frontiers<\/a>). Similarly, a test comparing life expectancy means, say between the US Northern and Southern states, would need to be adjusted to account for the inherent \u2018spatial pairing\u2019 of cases. 
This differs from temporal \u2018pairing\u2019: whereas time indexes a natural linear pairing, such that an observation is more similar to its prior (and subsequent) temporal pair, space induces such similarities across <em>several<\/em> \u2018prior\u2019\/lagged\/near-by regions, the spatially adjacent ones.<\/p>\n<p><a href=\"#_ednref5\" name=\"_edn5\"><span>[v]<\/span><\/a> Economists call it more properly \u201ccorrelated observations\u201d, E(yi, yj) \u2260 0 (Cameron &amp; Trivedi, 2009, <a href=\"https:\/\/www.stata.com\/bookstore\/microeconometrics-stata\">Stata.Com<\/a>, p. 81): it induces an excess resemblance, or similarity, compared to plain randomness, in the data. If the life expectancy at birth in one state does not provide any information about the life expectancy in its neighboring states, there is no spatial autocorrelation: the regions are independent. 
Yet, when the life expectancy of one state does influence or constrain our expectations for neighboring states (it reduces the number of possibilities, or the \u2018sample space\u2019; Conover, 1999, <a href=\"https:\/\/search.worldcat.org\/title\/39261809\">Worldcat<\/a>), making predictions more precise, the regions are no longer independent.<\/p>\n<p><a href=\"#_ednref6\" name=\"_edn6\"><span>[vi]<\/span><\/a> When cases are not equally probable (P<strong><em><sub>i<\/sub><\/em><\/strong> \u2260 1 \/ N), each case contributes more or less to the overall expected value. Two more layers add to the conundrum, making even the simple <em>average<\/em> of regional or areal aggregate values, like life expectancy across census tracts, less meaningful: <strong><em>(i)<\/em><\/strong> populations differ across regions; and, more importantly for our illustration, <strong><em>(ii)<\/em><\/strong> the value of one region depends on the values of its neighbors. <strong><em>(i).<\/em> <\/strong>The population-size comparability issue is visible if one considers a region with, say, one resident expected to live for 95 years, and another with 100 residents expected to live for 85 years: the two-region arithmetic average of 90 years clearly does not represent the collective population of 101 residents; the proper solution is instead a population-weighted average, here simply (95*1 + 85*100) \/ 101, or about 85.1 years. This particular issue is visible when comparing the arithmetic means of the same variable across different levels (states, counties, and census tracts): these values will differ. <strong>(ii). 
<\/strong>The classic arithmetic formula for the mean is in fact an outcome of the \u2018expectation\u2019 approach to finding the typical value: when each individual data point is equally likely (P<em><sub>i<\/sub><\/em> = 1 \/ N), the sum of the products of the variable values (Y<em><sub>i<\/sub><\/em>) and the probability of each value (P<em><sub>i<\/sub><\/em>) falls back on the common arithmetic formula: E(Y) = \u03a3 Y<em><sub>i<\/sub><\/em> * P<em><sub>i<\/sub><\/em> = \u03a3 (Y<em><sub>i<\/sub><\/em> * 1 \/ N) = (\u03a3 Y<em><sub>i<\/sub><\/em>) \/ N.<\/p>\n<p><a href=\"#_ednref7\" name=\"_edn7\"><span>[vii]<\/span><\/a> This \u2018loss of information\u2019 is not commonly visible, because life expectancy data at the US state level, for example, <em>appear<\/em> \u2018complete\u2019: each state has a value. The \u2018missing data\u2019 component becomes visible when the structure of the complete geographic data is revealed: such data in fact contain a set of distinct files, collectively called a \u2018shape\u2019 file. Some analytical software makes this evident by specifically providing ways of re-integrating attribute data files (like life expectancy per state) with their geographic \u2018twin\u2019 (the shape file): Stata\u2019s sp analytic module is such a tool, as is R\u2019s terralib, e.g. (Bivand, Pebesma, &amp; Gomez-Rubio, 2008, <a href=\"https:\/\/search.worldcat.org\/title\/851473724\">Worldcat<\/a>, p. 51), besides mapping software dedicated to such data structures, like GeoDa, QGIS, or ArcGIS.<\/p>\n<p><a href=\"#_ednref8\" name=\"_edn8\"><span>[viii]<\/span><\/a> Consider a simplified example where each state\u2019s life expectancy is classified as either \u2018high\u2019 (1, e.g. higher than the national mean) or \u2018low\u2019 (0). 
In this case, a table indicating which states are immediate neighbors, and thereby defining the neighboring relationships among states, is the typical spatial weights matrix (a \u2018spatial link matrix\u2019; Tiefelsdorf, 2000, <a href=\"https:\/\/search.worldcat.org\/title\/42476773\" class=\"broken_link\">Worldcat<\/a>, p. 38). If we tried to randomly assign 1\u2019s and 0\u2019s across all US states, and started with 1 for, say, Connecticut (CT), its three US neighbors (Massachusetts, New York, and Rhode Island) should get some random combination, like (1,1,0), whereas (1,1,1) or (0,0,0) would indicate \u2018perfect\u2019 auto-correlation. NY neighbors MA too, and it should also have such a random combination around it (under independence), but assigning random values to NY\u2019s neighbors now conflicts with the choice already made for CT. This interdependence, driven by spatial proximity, makes it (nearly) impossible to simulate perfect randomness across a map. This phenomenon underpins the classic \u2018map coloring problem\u2019 (Parks, 2012, <a href=\"https:\/\/oro.open.ac.uk\/54663\/\">Oro.Open<\/a>) and explains why we cannot generate \u2018completely random\u2019 spatial values.<\/p>\n<p><a href=\"#_ednref9\" name=\"_edn9\"><span>[ix]<\/span><\/a> More appropriately even, we can talk about many \u2018one-to-many\u2019 (Kenny, 1994, <a href=\"https:\/\/search.worldcat.org\/title\/30035276\">Worldcat<\/a>) relations, in which each region is both an \u2018ego\u2019 and an \u2018alter\u2019, for as many neighbors as it has (Hagood, 1943, <a href=\"https:\/\/www.jstor.org\/stable\/2570665\" class=\"broken_link\">Jstor<\/a>). 
With two clustering structures present, spatial areal data at more than one regional level require statistical methods that can model both the multilevel and the spatial non-independence structures; multilevel spatial path analysis (or SEM, if latent variables are modeled) is a natural option, although, to our knowledge, no <em>spatial<\/em> multilevel models have been published.<\/p>\n<p><a href=\"#_ednref10\" name=\"_edn10\"><span>[x]<\/span><\/a> Note that adjusting for the values of a third variable has another direct intuition: one simply assesses the relation between the two focal variables at each level of the third, and averages these level-specific effects across the third variable (Pearl, 2009, p. 80, <a href=\"https:\/\/bayes.cs.ucla.edu\/BOOK-2K\/neuberg-review.pdf\">Ucla.Edu<\/a>).<\/p>\n<p><a href=\"#_ednref11\" name=\"_edn11\"><span>[xi]<\/span><\/a> Life expectancy is calculated from \u2018life tables\u2019 of populations (or regions), which contain the number of residents and the number of deaths within age ranges. A classical method to derive it is Chiang\u2019s (1968, <a href=\"https:\/\/search.worldcat.org\/title\/371591\">Worldcat.org<\/a>), although others have been proposed (e.g. Silcocks, Jenner, &amp; Reza, 2001, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/11112949\/\">Pubmed<\/a>; see Eayres &amp; Williams, 2004, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/14966240\/\">Pubmed<\/a>). Life expectancy is the area under a survival curve (Chetty et al., 2016): in a simple graph of percent alive (or probability of surviving, on the vertical axis) as a function of age (horizontal axis), this area represents the average length of life (at a certain age; Modig, Rau, &amp; Ahlbom, 2020, <a href=\"https:\/\/bmjopen.bmj.com\/content\/10\/7\/e035932\">Bmjopen<\/a>). 
Of course, life expectancy is in itself a collective construct and a future-pointing concept, which uses today\u2019s mortality data to derive the likely life span of those born today; it can therefore be strongly influenced by temporal events, like the Covid-19 epidemic, which lowered life expectancy in the US by 0.33 years (Yan et al., 2024, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/37955927\/\">Pubmed<\/a>) and also widened the gender gap (Hayes &amp; Gupta, 2023, <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/38154830\/\">Pubmed<\/a>).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Causality in disparities spatial analytics The conundrum and common solutions\u00a0Na\u00efve\/aspatial methods analyzing spatial data overestimate effects because observations inherently have spatial autocorrelation; for instance, life expectancy shows a classic (na\u00efve) Pearson correlation even with the statistically meaningless FIPS census tract identifier (Pubmed),\u00a0 yet a proper spatial lag regression correctly yields no such relation. 
There is [&hellip;]<\/p>\n","protected":false},"author":2514,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"footnotes":""},"acf":[],"publishpress_future_action":{"enabled":false,"date":"2026-07-21 09:45:02","action":"change-status","newStatus":"draft","terms":[],"taxonomy":""},"_links":{"self":[{"href":"https:\/\/health.uconn.edu\/causality\/wp-json\/wp\/v2\/pages\/117"}],"collection":[{"href":"https:\/\/health.uconn.edu\/causality\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/health.uconn.edu\/causality\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/health.uconn.edu\/causality\/wp-json\/wp\/v2\/users\/2514"}],"replies":[{"embeddable":true,"href":"https:\/\/health.uconn.edu\/causality\/wp-json\/wp\/v2\/comments?post=117"}],"version-history":[{"count":5,"href":"https:\/\/health.uconn.edu\/causality\/wp-json\/wp\/v2\/pages\/117\/revisions"}],"predecessor-version":[{"id":127,"href":"https:\/\/health.uconn.edu\/causality\/wp-json\/wp\/v2\/pages\/117\/revisions\/127"}],"wp:attachment":[{"href":"https:\/\/health.uconn.edu\/causality\/wp-json\/wp\/v2\/media?parent=117"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}