Mapping Global Health: A network analysis of a heterogeneous publication domain

This paper examines one of the most visible but oddly neglected aspects of the rapidly expanding Global Health (GH) enterprise: its vast literature. Basing our data on the PubMed MeSH term “World Health” (changed to “Global Health” in 2015) and utilizing the citation and funding metadata provided by Web of Science, we analyze nearly 20,000 articles using the software platform CorTexT for the automatic processing of large text corpora. We perform several types of scientometric network analyses, and provide maps displaying inter-citations among journals publishing GH articles, co-authorship among the 292 authors who published 12 or more papers, co-citation analysis of works (articles, books, and reports) cited at least 30 times by the papers in our database, and funding sources since 2008. The maps display the social, cognitive, and funding substructure of the GH publication field. We suggest that this somewhat fragmented and fuzzy domain is held together by (1) a core group of authors who have for some time been co-authoring numerous papers and reports with one another; (2) several central journals, most notably the Lancet, addressing wider audiences and transcending the narrow specialization characteristic of scientific and biomedical fields; and (3) a growing body of large-data metrics, most prominently the Global Burden of Disease, which has become a rhetorical resource for numerous groups with different agendas.


Introduction
“Global Health” (henceforth: GH) has become a ubiquitous term that covers a large, heterogeneous, and rapidly growing set of activities. Foreign assistance for health directed at low- and middle-income countries rose by over 500 per cent between 1990 and 2010, when it plateaued. Where only a handful of university programs in GH existed before 2000, there are now 153 universities or organizations that are members of the Consortium of Universities for Global Health. 1 All this activity has produced a large number of specialized or local studies examining one or another corner of this domain, but a vision of the whole is remarkably lacking. With the exception of funding patterns that have become clearer due to ongoing studies by the Institute for Health Metrics and Evaluation, 2 we know relatively little about the overall architecture of this growing field (see, however, Hoffman et al, 2015). Definitions by its practitioners (e.g., Szlezák et al, 2010; Koplan et al, 2009) tend to be short, general, and highly normative. Social scientists, mainly anthropologists at this point, while providing many insights about the GH endeavor, have mostly adopted a strongly critical stance, denouncing the many ills of the GH enterprise including its perceived 'neoliberalism,' 'post-colonialism,' and technological determinism (e.g., Biehl and Petryna, 2013; Farmer et al, 2013). Less normative work tells us a great deal about the numerous issues and diseases associated with the field, including among other subjects tuberculosis (e.g., Gaudillière, 2014), tobacco policy (e.g., Reubi, 2016), its dominant “regimes” (Lakoff, 2010), as well as supplying analyses that subtly deconstruct the GH domain (e.g., Fassin, 2012).
We too will not attempt a comprehensive description/analysis of GH in this paper. We will instead undertake a 'second order' analysis (avoiding as much as possible value judgements and pre-determined interpretations) of one of its most visible but oddly neglected components: its vast literature. As a first step in this endeavor, we will provide a working definition of this GH literature and then utilize a semi-quantitative mapping approach to investigate its underlying structures. Such an approach will provide an initial analytical description of this large publication corpus that can serve as a starting point for future qualitative and quantitative analyses.

Establishing the database
The literature on GH is both huge and elusive. For one thing, the term itself is polysemic. An initial attempt to create a database of publications by searching for the term in Web of Science (title, abstract, keywords) got about 10,000 hits. It soon became apparent, however, that well over 20 per cent of these referred to a common category in Quality of Life instruments denoting the general health status of patients. A smaller number were incidental word combinations meaning 'total' and referring to such things as the 'global health' budget of a state or province. Rather than cleaning the database, a task that would have involved a myriad of value judgments about what was and was not a GH publication, we settled on a different strategy.
PubMed has had since 1972 a Medical Subject Headings (MeSH) term called “World Health.” This was changed only in early 2015 to “Global Health,” testifying to the conservative nature and consistency of PubMed's MeSH thesaurus, which for our purposes is a positive quality. The entire collection attached to this term as of March 2015 amounted to over 30,000 publications. It must be noted, however, that except for a necessarily vague definition of the MeSH term, 3 PubMed supplies only partial and perhaps dated information about the inclusion and exclusion criteria used by its classifiers (Bachrach and Charen, 1978; Nelson et al, 2001). But a number of limitations are clear. While this index includes social science references, it focuses predominantly on biomedicine and thus represents a biomedical vision of global health publication. This is not a major problem for our purposes since we focus in many of our analyses on the most published global health authors and the most cited works. Few historical or anthropological works would reach our thresholds of inclusion. More seriously, books and reports (grey literature), both numerous and important in this field, are not noted in this source, which covers only periodical literature. This creates a serious gap in our data. Our co-citation analyses partially cover this gap by allowing us to gauge the influence of such works among our periodical sources. The result is less than perfect, but our MeSH-based strategy has the distinctive advantage of avoiding our own subjective choice of sources, which would most certainly affect final results and could not be reproduced by other scholars. Whatever the limitations of our PubMed database, it is consistent, transparent, and reproducible. It reflects the GH publication domain as defined by trained indexers producing the most influential thesaurus of biomedical literature.
PubMed, while equipped with a strong MeSH thesaurus, does not include citation information (both citing and cited publications), a key resource for exploring the dynamics of a domain. For this, we need to turn to a database such as Web of Science (WoS), which has a relatively weak keyword system and includes fewer biomedical publications (albeit all the most relevant ones as defined by their impact factor). To utilize the benefits of both databases, we located all the PubMed articles that were also included in WoS. This yielded 19,595 texts, or nearly two-thirds of the PubMed hits, authored or co-authored by 39,650 individuals. The proportion of PubMed articles in WoS rises considerably with time, reaching 80 per cent around 2014 (see Figure 1), as WoS expanded the number of biomedical journals it surveys and the field gained recognition and found its way into an increasing number of mainstream journals.
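The matching step and the yearly coverage curve described above can be illustrated with a minimal sketch. It assumes that each PubMed record carries an identifier and a publication year and that the set of identifiers also found in WoS is already known; the function name `coverage_by_year` and the toy records are hypothetical and are not taken from the study's data or tooling.

```python
from collections import Counter

def coverage_by_year(pubmed_records, wos_ids):
    """Return {year: share of that year's PubMed records also found in WoS}."""
    total = Counter()
    matched = Counter()
    for record_id, year in pubmed_records:
        total[year] += 1
        if record_id in wos_ids:
            matched[year] += 1
    return {year: matched[year] / total[year] for year in sorted(total)}

# Hypothetical toy records: (identifier, publication year).
pubmed_records = [
    ("p1", 2000), ("p2", 2000), ("p3", 2000),
    ("p4", 2014), ("p5", 2014), ("p6", 2014), ("p7", 2014), ("p8", 2014),
]
# Hypothetical identifiers that were also located in WoS.
wos_ids = {"p1", "p4", "p5", "p6", "p7"}

coverage = coverage_by_year(pubmed_records, wos_ids)
for year, share in coverage.items():
    print(year, f"{share:.0%}")
```

In this toy example one of three records from 2000 and four of five records from 2014 are matched, which mimics the rising coverage pattern reported in the text without reproducing its actual figures.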

Basic statistics
Our mapping approach (see below) has the advantage of not reducing figurational complexity (Elias, 1978) to a few statistical indicators. Nonetheless, an initial statistical description can provide us with some insights into the content of the database, and serve simultaneously as quality control. Figure 1, in addition to displaying the relationship between publications listed in PubMed and WoS, clearly illustrates the staggering growth of this body of literature since the late 1990s. Given the relatively small number of publications in the 1980s and 1990s, our focus will be on the post-2000 period. It must be noted that this is a highly unusual body of literature. According to PubMed's own analytical categories, 13.5 per cent of the publications consist of “editorial” material. To put this in perspective, other highly normative MeSH categories are “health policy” with 8.7 per cent and “biomedical ethics” with 6.2 per cent editorial material. Most disease-based categories like tuberculosis and neoplasms come in at between 1 and 2 per cent editorial material. Using a more inclusive definition of the term, WoS categorizes 30 per cent of the world/global health publications that it covers as editorial. 4 That means that in addition to the 13.5 per cent that PubMed classifies as editorial there is another 17 per cent or so whose categorization produces classificatory disagreement. This is not entirely surprising given the normative and advocacy orientation of so much of this literature.
All these articles appeared in well over 1000 journals, with over 800 journals publishing five or more of the papers in the database. The majority were published in general medical, public health, or science journals, with The Lancet in a class by itself, being responsible for 1458 articles or editorials. It was followed by the British Medical Journal with 737. The Bulletin of the WHO (561 articles) and Lancet Infectious Diseases (160) are the only journals devoted specifically to GH among the 10 periodicals with the greatest number of articles. 5 The large British role in this literature will continue to be evident as our analysis proceeds.

Mapping platform
In order to analyze the GH database, we used the software platform CorTexT (www.cortext.fr), which comprises algorithms designed to process bibliographic data and to perform several types of scientometric network analyses (Rule et al, 2015; Cointet et al, 2012; Jones et al, 2011). To display the links in the resulting networks, CorTexT applies a dynamic positioning algorithm that optimizes the location of all the nodes by minimizing the overall strain in the network.
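To give a rough sense of what positioning nodes by reducing network strain involves, the following is a minimal sketch of a generic force-directed (Fruchterman-Reingold style) layout. It is an assumption made purely for illustration: the text does not specify which algorithm CorTexT implements, and the function `spring_layout`, its parameters, and the toy graph are all hypothetical.

```python
import math
import random

def spring_layout(nodes, edges, iterations=200, seed=0):
    """Generic force-directed layout: linked nodes attract, all nodes repel."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))   # preferred distance between nodes
    step_limit = 0.1                  # maximum move per iteration, cooled over time
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                disp[a][0] += dx / dist * push
                disp[a][1] += dy / dist * push
                disp[b][0] -= dx / dist * push
                disp[b][1] -= dy / dist * push
        # Attraction along each link.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            disp[a][0] -= dx / dist * pull
            disp[a][1] -= dy / dist * pull
            disp[b][0] += dx / dist * pull
            disp[b][1] += dy / dist * pull
        # Move each node along its net force, capped by the current step limit.
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1]) or 1e-9
            move = min(length, step_limit)
            pos[n][0] += disp[n][0] / length * move
            pos[n][1] += disp[n][1] / length * move
        step_limit *= 0.98
    return pos

# Hypothetical toy network: two triangles joined by a single bridging link.
nodes = ["a1", "a2", "a3", "b1", "b2", "b3"]
edges = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
         ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
         ("a3", "b1")]
positions = spring_layout(nodes, edges)
```

Layouts of this family tend to pull densely linked nodes together and push weakly linked groups apart, which is why clusters become visually apparent on such maps.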
CorTexT also uses an automatic clustering algorithm to define (and color-code) clusters, i.e., cohesive subsets of the network that provide a high-level, fully bottom-up description of the network. To facilitate interpretation, CorTexT color-codes and adds circles around each cluster. The process of mapping was followed by a detailed, manual inspection of the content of individual clusters and their relationships.

Results and Discussion
Inter-citation
Journal inter-citation is the relation established when an article in Journal A cites an article in Journal B. Analysis of inter-citation patterns reveals how closely journals are related based on the journals cited by articles that they publish. A network map of inter-citation connections provides an overall view of the knowledge structure of a field and its subfields. We can thus ask: To what extent do all these articles constitute a coherent scientific domain? Judging by the inter-citation map (Figure 2), the answer is: not very much. There is a large central cluster of journals (including many of those in our top 10 list) surrounded by a number of more disease-specific clusters with relatively few citation links among them. What holds them all together are a number of generalist medical journals in the central general cluster, most notably The Lancet, which is richly connected to all but the most outlying of the clusters, as well as JAMA and the BMJ. The infectious/tropical disease cluster is most closely associated with the central cluster (with the Journal of Infectious Diseases playing a visible bridging role), an understandable pattern considering the dominant role such diseases have played in the GH enterprise. Another generalist medical journal, the New England Journal of Medicine, provides more modest citation linkages to several other clusters. It is worth again noting the central importance of European and British publications in the intellectual (or at least publication) development of GH. This is not just true of journals. Of the 46,707 authorial institutional affiliations mentioned in the corpus, non-American English-language institutions loom large. Authors affiliated with the World Health Organization are listed most frequently, 1601 times. The University of London and its various colleges are mentioned 1317 times, while the London School of Hygiene and Tropical Medicine is listed 635 times.
American universities are of course far from absent. Harvard authors are recorded 943 times and several, admittedly geographically dispersed, branches of the University of California system 1075 times. The University of Washington, flush with Gates funding, has 615 mentions. The University of Toronto is not too far behind with 589. Among governmental institutions, authors associated with the Centers for Disease Control account for 585 authorial affiliations. City affiliation catalogued by WoS tells much the same story, with London at 2769 mentions, Geneva at 1959, followed by Boston (1444), New York (1269), and Washington (973).
If one looks at national affiliations of authors, the US is well ahead of other countries with 16,296 out of 46,707 (35 per cent) mentions. This is a significant American presence but hardly predominant. On the other hand, the English-speaking world looms very large indeed. The US is followed by the UK (5401), Canada (2993), and Australia (2461). Switzerland follows (2439) despite its non-English-language character, but its prominence is largely explained by international institutions like the World Health Organization that are concentrated in the Geneva area.
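The inter-citation relation defined at the start of this section can be made concrete with a short sketch. It assumes that the corpus has already been reduced to (citing journal, cited journal) pairs; the function names, the decision to ignore journal self-citations by default, and the toy pairs are illustrative assumptions rather than a description of how CorTexT computes its maps.

```python
from collections import Counter

def inter_citation_counts(citation_pairs, include_self=False):
    """Count directed citations from one journal to another."""
    counts = Counter()
    for citing, cited in citation_pairs:
        if include_self or citing != cited:
            counts[(citing, cited)] += 1
    return counts

def link_strength(counts, journal_a, journal_b):
    """Undirected tie: citations flowing in either direction between two journals."""
    return counts[(journal_a, journal_b)] + counts[(journal_b, journal_a)]

# Hypothetical toy pairs: (citing journal, cited journal).
citation_pairs = [
    ("Journal A", "Journal B"),
    ("Journal A", "Journal B"),
    ("Journal B", "Journal A"),
    ("Journal C", "Journal A"),
    ("Journal A", "Journal A"),  # self-citation, dropped by default
]

counts = inter_citation_counts(citation_pairs)
print(link_strength(counts, "Journal A", "Journal B"))
```

A map is then a matter of treating journals as nodes and these link strengths as weighted edges; journals with strong ties to many otherwise separate groups play the bridging role attributed above to generalist journals.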

Collaborative patterns: co-authorship
It is impractical to analyze the authors of nearly 20,000 publications and also unnecessary, since most authors published fewer than 10 articles, with the vast majority (77 per cent) publishing only one (Figure 3). The distribution classically follows a power law. In order to constitute a group of authors large enough to be considered core authors in the field, we have chosen to include all authors with 12 or more publications in this database. This gives us 292 authors collectively producing 3708 publications. Thus, 0.7 per cent of the authors account for 18.9 per cent of the total number of GH articles in the database. At the high end are authors like Christopher Murray with 91 publications, Mario Raviglione with 62, Alan Lopez and Richard Horton with 59 each, and Zulfiqar A. Bhutta with 56, all of whom have become prototypically associated with the GH domain, albeit for different reasons. (On prototypical domains see below.) Obviously, emphasis on numbers privileges older individuals who have been publishing for some time, but it is a reasonable way of identifying a core set of authors who have over time played a disproportionate role in the GH literature. They have exerted influence in other ways as well. This small cadre of authors has received 27 per cent of all citations in the highly cited articles (10 or more citations) in our database. In other words, they are highly cited in the most cited articles.
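The selection of core authors and the two percentages quoted above amount to a simple threshold and two ratios. The sketch below checks the arithmetic using only figures given in the text (292 core authors out of 39,650; 3708 publications out of 19,595); the function name `core_authors` and the toy publication counts are hypothetical.

```python
def core_authors(pubs_per_author, threshold=12):
    """Return the authors whose publication count meets the threshold."""
    return {author: n for author, n in pubs_per_author.items() if n >= threshold}

# Hypothetical toy counts, only to show the thresholding step.
toy_counts = {"author_a": 40, "author_b": 12, "author_c": 3, "author_d": 1}
toy_core = core_authors(toy_counts)

# Arithmetic check on the figures reported in the text.
author_share = round(100 * 292 / 39_650, 1)        # per cent of all authors
publication_share = round(100 * 3708 / 19_595, 1)  # per cent of all articles
print(author_share, publication_share)
```

The rounded results agree with the 0.7 and 18.9 per cent stated in the paragraph.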
In order to examine co-authorship patterns, we produced a comprehensive set of cumulative co-authorship maps displaying the collaborative links between authors who published at least 12 papers. The first map, with only a couple of authors, goes back to 1988, and each subsequent map adds one year. This approach (as compared to simply producing maps for a given year or specified period) has the advantage of showing, when one moves from one map to the next, the animation-like concretion of a core set of authors who will contribute to the definition of the domain. It also shows the temporal dynamics of the constitution of the domain, for instance when initially distant clusters establish stronger connections or merge, and when new clusters appear. For space reasons, we show here only the cumulative map as of early 2015 (Figure 4). By the year 2000, there is considerable co-authorship but one gets no sense of a coherent field. Co-authored papers are about relatively narrow domains like tuberculosis or maternal health, and there is virtually no authorial connection from one field to the next. Things, however, quickly begin to evolve. At first, it is authors in closely associated sectors like the different infectious diseases that begin to co-author articles. By early 2015, plotting co-authorships cumulatively across the entire 15-year period yields a dense network of co-authorships cutting across specific domains.
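The cumulative construction described above, in which each yearly map contains all co-authorship links accumulated up to that year, can be sketched as follows. The input format (a list of year and author-list pairs), the function name, and the toy papers are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations

def cumulative_coauthorship(papers, core_authors):
    """Return {year: Counter of co-author pairs accumulated up to that year}."""
    snapshots = {}
    running = Counter()
    for year in sorted({y for y, _ in papers}):
        for paper_year, authors in papers:
            if paper_year != year:
                continue
            # Keep only core authors; sort so each pair has one canonical form.
            kept = sorted(set(authors) & core_authors)
            running.update(combinations(kept, 2))
        snapshots[year] = Counter(running)  # copy, so later years do not alter it
    return snapshots

# Hypothetical toy corpus: (year, authors). "X" is not a core author.
papers = [
    (1988, ["A", "B"]),
    (1989, ["B", "C", "X"]),
    (1990, ["A", "B", "C"]),
]
maps = cumulative_coauthorship(papers, core_authors={"A", "B", "C"})
for year, links in maps.items():
    print(year, dict(links))
```

Stepping through the snapshots in order shows links being added and strengthened over time, which is the effect the sequence of maps is meant to convey.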
There seem to be at least three different patterns of co-authorship.
1. Authors doing research in the same domain. By far the two densest clusters of authors by early 2015 were the individuals involved in the Global Burden of Disease (GBD) project, a major, domain-defining metrics project designed to provide “a comprehensive assessment of mortality and disability from diseases, injuries, and risk factors in 1990 and projected to 2020” (Murray and Lopez, 1996), and the smaller group of authors working on mental health. In the first case, the cluster emerged modestly in the early years of the century following the publication of the original GBD in various forms from 1993 to 1997. There was only modest development after Christopher […] Other clusters that emerged early and that remain visibly dense are tuberculosis and infant and child health. There are also small clusters that are visible early in the century but which disappear from view (e.g., climate and health, disability) as more and more co-authored publications are added to the database.
2. Authors linked by advocacy. Individuals have increasingly come together in groups and consortia in order to advocate for one strategy or another or register complaints about GH politics. In the early 2000s, for instance, specialists in different disciplines co-signed pieces about the problems of the WHO (Binka et al, 2002) or advocating a more intense response to AIDS (Stover et al, 2002). As GH gained in popularity and interest, more and more of these collective articles appeared, spearheaded by such groups as the Lancet NCD [non-communicable diseases] Action Group, the NCD Alliance, and the Disease Control Priorities Project (a joint project of the Fogarty International Center of the US National Institutes of Health, the WHO, and The World Bank, launched in 2001 to deal with policy change). Sometimes authors with rather different policy agendas come together for a specific purpose, while at other times they co-sign articles with authors who have also worked with differently oriented groups. That is why the relatively dense cluster at the top center of the final map has remarkably little thematic coherence. It is held together by the existence of numerous multi-authored works on a variety of topics.
3. One final and critical source of co-authorship is collaboration for metric purposes. The Global Burden of Disease team has not just become denser with time; it has actively sought collaboration with other specialty clusters for whom its data are relevant. This advances the role and credibility of GBD within the GH field and also serves Christopher Murray's apparently insatiable thirst for more and better data. Partner groups get information they need in order to develop and advocate for programs and justify demands for increased resources. It has in fact become something of a cliché for articles on virtually any disease to begin with a formulaic statement that this disease is or is becoming a major GH burden or problem or crisis. Clusters like the one devoted to mental health that demand greater resources have over the years developed extensive ties with GBD authors because the GBD appears to make visible the great burden of mental illness throughout the world. More diffuse ties link the GBD and child health clusters. Such cross-cutting articles frequently involve especially large groups of authors coming from or expert in different geographical regions.
Overall, 6 per cent of the papers in our sample were produced by groups of 10 or more authors. As shown in Figure 4 (the digital versions of the map allow readers to zoom in and search for individual names), certain individuals play a key role as structural nodes or bridges among clusters. Somnath Chatterji, for instance, was for several years the main link between the mental health and GBD clusters, and later to a small disability cluster (mental illness causes disabilities). Ziad Memish, an infectious disease expert from Saudi Arabia, seems to have developed links with almost every visible cluster. In sum, it would seem that for the core authors in this domain, GH is not just a convenient umbrella label under which a variety of unrelated authors publish on diverse subjects. It has within a 15-year period become a relatively well-defined and structured collaborative domain, at least with respect to its most prolific authors who co-publish frequently and in recognizable patterns. Can we say the same of the intellectual worlds in which they function, as understood through co-citation patterns?
The cognitive landscape: Co-citations
To get a sense of the cognitive landscape guiding the work of our core GH authors, we examined the publications they most frequently cited. We limited ourselves to works cited at least 30 times by all the authors in our database. This gave us 203 cited works. Those most cited by our authors were several early reports and articles on the Global Burden of Disease, followed by the World Bank's influential World Development Report of 1993. 6 Unsurprisingly, the by-now numerous GBD-linked studies are cited frequently since they provide an ongoing source of data useful to many authors. The same is true to a lesser degree for the annual World Health Reports published by the WHO.
6 These are not necessarily the most cited papers in our database but only those most cited by the articles in our database.
Instead of relying on simple statistical indicators such as citation counts, we utilized a more sophisticated method known as co-citation analysis to examine the overall structure of the citation domain. Article A and article B are co-cited if they appear together in the reference list of a subsequent article; the assumption is that co-cited articles are related and of relevance to researchers in that particular domain at that point in time. Maps showing clusters of the most frequently co-cited articles therefore display the cognitive substructure of a field. The co-citation maps we are working with are cumulative, meaning that the co-citations found in our publications are added to the co-citations of earlier periods, with the qualification that some may disappear if their proximity threshold falls under a fixed point (because they are no longer cited together). Aside from avoiding pre-defined periodization that might shape the results, and similar to the cumulative co-authorship maps, the advantage of this approach is to show how new co-citations (arguably, new subfields or redefinitions of a subfield) are grafted onto the existing ones. Historically speaking, this has the advantage of showing how redefinitions of a given domain do not emerge from nothing, but refer (or do not refer) to an existing structuration of this domain. Although the number of publications in our sample grew during the 1990s, there were only a few co-cited texts in 1997. These in fact were limited to three thin clusters. The first involved the various early versions of the Global Burden of Disease. The second was centered on tuberculosis although one also finds Fenner et al's (1988) lengthy study of smallpox eradication (presumably an inspiration for everyone involved with infectious diseases). The third included a motley series of policy or theoretical statements including Abdul Omran's famous 1971 article on the epidemiologic transition (see Weisz and Olszynko-Gryn, 2010), Walsh and Warren's (1979) statement on selective primary health care, and the World Bank's World Development Report of 1993. On the side of greater equity was Wilkinson (1996) on the effects of economic inequalities on health. Godlee (1994) demanded reform of the WHO. Two co-cited articles on HIV/AIDS signal the beginning of a cluster that would appear in subsequent years.
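Co-citation as defined above (two works appearing together in the reference list of a later article) can be counted with a few lines of code. The sketch below assumes that each citing article is available as a list of cited-work identifiers and applies a minimum-citation filter analogous to the 30-citation threshold mentioned earlier; the function name, the toy threshold of 2, and the identifiers are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(reference_lists, min_citations=1):
    """Count how often two sufficiently cited works share a reference list."""
    # Number of citing articles in which each work appears.
    cited = Counter(ref for refs in reference_lists for ref in set(refs))
    frequent = {ref for ref, n in cited.items() if n >= min_citations}
    pairs = Counter()
    for refs in reference_lists:
        kept = sorted(set(refs) & frequent)   # sorted gives one form per pair
        pairs.update(combinations(kept, 2))
    return pairs

# Hypothetical toy reference lists, one per citing article.
reference_lists = [
    ["work_1", "work_2", "work_3"],
    ["work_1", "work_2"],
    ["work_1", "work_4"],
]
pairs = cocitation_counts(reference_lists, min_citations=2)
print(dict(pairs))
```

Running the same count on the references accumulated up to successive years, and dropping pairs that fall below a proximity threshold, would approximate the cumulative maps described in the text; how CorTexT defines that threshold is not specified here.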
By 2004, cumulative co-citations suggest the emergence of a real core domain but one that remained highly fragmented, with only weak links among the different clusters. This situation changed quickly. Only a few years later we find a much more closely connected group of clusters, with several notable outliers. Let us start with the co-citation map extending to 2004 (Figure 5). At the center of the map, we see two slightly overlapping clusters (C1 and C2). C1 is held together by empirical survey data, on such issues as cancer (Doll and Peto, 1981; Parkin et al, 1997), mortality due to tobacco (Peto et al, 1992), and effects of blood pressure on mortality (Lewington et al, 2002); clinical epidemiologist Richard Peto looms very large in all these publications. The cluster also includes two WHO World Health Reports (2001 and 2004) and articles by several leaders of the GBD (e.g., Murray and Frenk, 2000), which was until 2002 housed at the WHO. Closely connected to C1, C2 is dominated by the classic GBD Studies of 1996 and 1997, but also includes several turn-of-the-century World Health Reports (closely connected at that point to the GBD). Not quite so prominent are many of the foundational texts of GH reflecting its various ideological and strategic positions. There is Omran (still highly cited), the Alma Ata statement on Primary Health Care (WHO, 1978), Walsh and Warren's (1979) statement on Selective Primary Care, the Commission on Health Research for Development (1990) that pointed out the discrepancy between research spending and world population needs, the World Development Report of 1993, and the contentious WHO commission on Macroeconomics of 2001, sometimes viewed as the incursion of World Bank economists into the WHO. All in all, these suggest the increasingly central role of World Bank views and strategies on the thinking of our core GH authors.
Such views structure debate not only for advocates but also for critics for whom this is the worst kind of 'neo-liberalism.' It is, however, noticeable that links to other clusters remain fairly sparse, indicating that these co-citations are at this point largely programmatic, with little relevance for most disease-based groups. Many of the thin links among clusters are due to the bridging functions of WHO's World Health Reports, general enough to be cited in a variety of contexts.
Surrounding the two central clusters are a number of more or less isolated clusters. Loosely connected to C1, C3 deals with various chronic diseases including asthma, cardiovascular disease, as well as the related International Tobacco Convention, while the even more loosely connected C5 corresponds to the beginnings of a mental health cluster. The central clusters are also connected via two bridge publications (Fenner's book on smallpox eradication and the WHO's World Health Report of 1999) to C8 and C9, two related clusters dealing, respectively, with emerging infectious diseases and HIV/AIDS, already a central motor for the massive increase in GH funding (Brandt, 2013; Packard, 2016). Both are largely though not exclusively American, with publications of the Centers for Disease Control playing a prominent part. The bridging role of the 1999 WHO report is explained by the fact that Dean Jamison, the lead author of this report (and a close collaborator of Chris Murray and the GBD group) is also lead author of the World Development Report of 1993 to which it is linked, and was Chair of the Institute of Medicine Committee that published a statement on GH in 1997 that was influential in mobilizing the American government around infectious disease prevention. Finally, we have a number of self-standing, unconnected clusters: C4 devoted to diabetes and to the related issue of nutrition, C6 devoted to medical education for GH, C7 devoted to tuberculosis, and C10 that hardly qualifies as a cluster, as confirmed by the fact that in years to come its various components would gradually migrate to more developed groupings. To sum up, we have a major central component consisting of foundational texts and metrics publications: they structure the field, reaching out, on the one hand, to chronic diseases and mental health, and, on the other, via two bridging publications, to clusters dealing with infectious diseases.
In succeeding years, co-citation clusters grew and developed increasing links among themselves. In fact, by 2008 all the clusters figuring on the map (Figure 6) are interconnected, even if some are only loosely so. 7 Within each cluster, groupings tend to be fairly heterogeneous, seldom devoted clearly to a single theme, suggesting that divisions in this growing domain had not yet rigidified. But they nevertheless display some coherence that would become more evident in succeeding years. The central C1 contains the classics of GBD supplemented by several WHO World Health Reports (mainly from Brundtland's tenure as Director-General). World Bank influence appears to have grown within this cluster with the World Development Report of 1993 now joined at the cluster margins by a revised and expanded version (Jamison, 2006). Thus, C1 contains many of the core GH documents that critics would describe as 'neo-liberal,' and which critics and supporters alike would characterize as dominated by economic reasoning. In addition to the aforementioned 1993 and 2006 publications directed by Dean Jamison, one finds the WHO Commission on Macroeconomics of 2001 and the WHO World Health Report of 2000 which Chris Murray helped to write, and which is famous or infamous for its ranking of national health systems. C1 is strongly connected to the very dense C3, a metrics cluster made up of surveys and studies dominated by the GBD and to a lesser extent WHO publications. It cites articles and reports that are largely about non-communicable or chronic diseases. Some of these are more generally oriented but nonetheless point to the significance of the NCD problem. C3 includes less frequently co-cited programmatic statements about the NCD problem (C3B). These are part of a vigorous effort to increase GH funding for diseases that appear to be expanding quickly in low- and middle-income countries, including sub-Saharan Africa (see Weisz and Vignola Gagné, 2015; Reubi et al, 2015).
Not surprisingly, Omran's Epidemiologic Transition (Omran, 1971), which predicted this development nearly 50 years ago, along with Jamison (2006), which restated his 1993 emphasis on this shift, are major bridges to this cluster, which is in turn linked to the less dense cluster C4, consisting of citations of more medically oriented studies of specific chronic diseases, mostly published in disease-based journals. C1 is also connected, albeit far more loosely, to the counterpart of NCDs, namely infectious diseases: emerging diseases and HIV/AIDS have merged into a single C8 that overlaps with a tuberculosis C7.
7 The map also displays a number of small clusters with only weak links to the major component: C4 is a collection of psychiatric publications, C5 centers on GH education, grown somewhat denser as a result of the increasing popularity of GH on university campuses and the need to develop goals and curricula, and C6 is a small emerging cluster on tropical diseases with a contribution on climate change.
As noted, C1 displays some of the classics of GH, but older GH classics have moved to C2: these include the 1978 Alma Ata statement and the Commission on Health Research of 1990. There is also a later article about the 2008 WHO commission report on social determinants of health by Michael Marmot and collaborators, as well as several articles with equity in the title (Victora 2003; Saxena 2007). There are two ways to interpret the C2 configuration. The first is to suggest that what unites many, if not all, of the titles is that they represent an alternative to the GBD/World Bank axis by emphasizing equity rather than or in addition to economic efficiency. This trend is the outcome of the post-Brundtland embrace by the WHO of its Alma Ata heritage, meaning the 1978 international “health for all” declaration emphasizing the role of primary health care. The core historical statements and the emphasis on social determinants and equity are of a piece with the emphasis on child and maternal health, a domain traditional for UN agencies and which the GBD somewhat deemphasized. A second and not incompatible explanation is that what links them is place of publication: The Lancet. Nearly all the articles in C2 appeared in this publication. Just as the GBD by 2008 had become a major institution with its own core texts, authors, and constituencies, The Lancet had developed its own constituencies and interests, some of which intersect with the GBD and some of which do not. The subjects in this configuration reflect issues that Richard Horton, editor of the journal since 1995 (and vigorously pursuing his own complex agenda), and his authors are interested in: infant and maternal health, social determinants of health, the legacy of Alma Ata. This is not so surprising when one thinks about it.
For specialists in many fields, the one general periodical likely to be read and written for was The Lancet, a journal that under Horton has unquestionably become the leading publication in the GH field. It is not unexpected that authors publishing in this journal have tended to disproportionately co-cite articles in this same journal.
Using the 2015 map as our reference (Figure 7), we see that in the years that follow most traditional clusters remained stable although they developed many more connections; indeed, the map now comprises a single, strongly interconnected central component. At the center of this configuration lies C1, which consists of the by-now classic GBD publications. The World Development Report of 1993 is now at the margins of the cluster, having been displaced by its successor Jamison (2006) as the major policy statement of this configuration. Although they do not seem to have quite the same impact, WHO annual World Health Reports are frequently on the borders of several clusters and seem to play an important bridging role among them because they are general enough to be relevant to different fields. They serve, in terms coined by sociologists of science, as ''boundary objects'' (Star and Griesemer, 1989). The one cluster that appears to have remained, somewhat surprisingly, relatively isolated is C7 (the infectious disease, HIV/AIDS, and TB cluster), which also appears to have become somewhat outdated with few recent publications. Whether this has to do with medical success in controlling HIV/AIDS and the consequent emphasis on distribution of medications rather than research, or the increasing numbers of chronic disease specialists who are now writing about GH, or both, is not clear. Less surprisingly isolated is the education C5 whose links to the outside are largely due to the bridging work of an article by Frenk et al (2010) and an historical article on the origins of GH in the 1990s (Brown et al, 2006). Given that mental health articles have migrated into the C6 psychiatric cluster, C3 is now strongly oriented toward child/maternal health. C3's connection to the central C1 transits via the C4 equity configuration that remains closely connected to some of the key historical statements of GH including the Alma Ata statement and the key texts of the 1990s, joined by classic works on inequality like Sen (1999) and Wilkinson (1996), now migrated from other clusters. With the addition of the Report on Social Determinants of Health (WHO, 2008), it is hard not to see C4 as at least in part an ideological or strategic counterweight to what was by now clearly the center of gravity of GH co-citations: C1 made up of metrics articles, primarily associated with the GBD and its cost/benefit orientation, and supplemented by other statistical sources like the GLOBOCAN series published by the International Agency for Research on Cancer and several WHO World Health Reports, as well as a number of other contributions reflecting evolving World Bank-inspired economic thinking.
The GBD, in particular, has visibly fed into the extremely dense NCD C2, and to a lesser extent the growing mental health C6, while also maintaining connections with the equity/social determinants C4 (indicating that cost/benefit and equity orientations can and indeed do frequently co-exist, with authors moving from one to the other as conditions dictate). The GBD project produces many highly co-cited articles; it is now a major enterprise analyzing a variety of different metrics that are useful to many different constituencies. Furthermore, a growing metric enterprise like this one tends to self-reference earlier material on which it is built. Finally, huge surveys of this sort are not easily replaced by newer versions (nearly 20 years separated the first versions of the GBD from a new version). This means that, like methodological articles, metric papers can maintain high citation numbers far longer than research articles. It would not be an exaggeration to say that since 2007, when Gates Foundation money funded the Institute for Health Metrics and Evaluation at the University of Washington, the GBD has been central in holding together the disparate GH publication enterprise. It strongly supports the advocacy claims of several major groups (chronic diseases, mental health), and few authors writing about a disease do not present their subject as a significant GH 'burden.' Even the numerous articles that critique DALYs, a measure of overall disease burden, or the GBD more generally, usually cite the major GBD texts that frame discussions.

Funding
Research funding has attracted increasing attention in recent years, but analyzing it remains a highly problematic exercise (Grassano et al, 2016). Nonetheless, our database can provide us with a first impression of GH research funding. WoS began systematically collecting information about funding in 2008. 8 While utilizing statements of funding sources and acknowledgments in its data, WoS complicates matters by including information contained in conflict-of-interest statements; this results in mention of organizations, usually pharmaceutical companies, that paid researchers in the past, through grants, consulting or lecturing fees, editorial aid, and a variety of other functions and perks. This has been noted by at least one group of scientometric researchers (Lewison and Sullivan, 2015) who calculated that such non-direct funding may constitute as much as 50 per cent of WoS funding hits in some domains. We initially considered cleaning up these data but eventually came to the conclusion that this effort was misdirected. Research funding is a complex phenomenon and is not just the result of targeted grants. It results from a dense web of previous grants, relationships with funders, and non-specific or even non-financial benefits that allow researchers to interact, publish frequently, and collect yet more grants. One could call this configuration of direct funding, fees, salaries, and perks the 'financial ecology' of research. Consequently, we shall include in our analysis all information that WoS lists as funding, and seek, within the limitations of our data, to make sense of the 'financial ecology' of GH publishing.
We identified 4134 funding institutions in our full database. These appeared in 2177 different articles, in the majority of cases only once or twice. A few appeared before 2008 and were included in our analysis. After 2008, such information is featured with increasing frequency; the number of mentions doubled from 2009 to 2010, suggesting more consistent reporting. The 10 most frequently mentioned institutions include grant agencies, charities, and pharmaceutical companies, namely the US NIH (all institutes) with 429 hits, followed by the Bill and Melinda Gates Foundation (222), the WHO (138), Pfizer (128), the European Union (114), the Wellcome Trust (83), Novartis and Eli Lilly (both 81), GlaxoSmithKline (79), and the Australian NHMRC (72).
Things look slightly different if we take account of the WoS ESI (Essential Science Indicators) collection of highly cited papers derived from a more complex series of indicators than mere citation numbers. 9 Simply put, ESI takes into account differences in citation behavior and numbers between different domains. While this is a very reasonable approach, it can be questionable in the case of GH, which is not a WoS recognized research area but, rather, an assemblage of articles from different areas. ESI lists the 550 most cited articles in our database. Since it limits itself to the last 10 years (starting in 2006), it has a much higher proportion of reported funding sources (more than half of the articles listed) than the highly cited articles in our own database. Aforementioned caveats aside, the results are provocative. Among the ESI articles, the most frequent funder is BMGF with 73 mentions. The various institutes of NIH follow with 58, the Wellcome Trust 28, WHO 27, Pfizer 24, Novartis 22, Merck 19, and the UK Medical Research Council 16.
8 Sporadic information about previous years appears to have retrospectively been included by WoS.
9 http://esi.webofknowledge.com/help/h_dathic.htm.
It is worth looking in greater detail at the large number of BMGF-funded papers among these highly cited papers. Nine of the 10 most highly cited articles funded by the Foundation are based on large projects to produce metrics and were published in The Lancet. Nearly all of these were produced by one or another arm of the Global Burden of Disease project. The three most highly cited papers, with close to 6000 citations among them, report on aspects of the Global Burden of Disease Study of 2010, as does the paper ranked sixth in citations. They are signed by a very large number of authors and the words ''Funding Bill and Melinda Gates Foundation'' appear in bold following the summary. Some of the individual collaborators had funding from other sources and these appear in acknowledgements in small font at the end of the papers. The next two most cited articles (with over 1000 citations each), also published in The Lancet, were jointly funded by BMGF and the WHO, a fact also prominently displayed after the article summary. They were produced by the Global Burden of Metabolic Risk Factors of Chronic Diseases Collaborating Group. The article ranked seventh is the main outlier among the 10 articles. Produced by a group called The WHO Rapid Pandemic Assessment Collaboration, it reports on the potential danger of a strain of the H1N1 virus that had pandemic potential. It was published in Science and BMGF was one of numerous institutions whose staff provided 'support.' Number 8 on our list produced estimates of world-wide childhood mortality. It was authored by members of the Child Health Epidemiology Reference Group of WHO and UNICEF, dominated by WHO staff but with prominently displayed funding from BMGF. Number 9 was an analysis of efforts to control global malaria mortality from 1980 to 2010 whose lead author was Chris Murray himself. Number 10 was funded by BMGF and WHO as part of the Child Health Epidemiology Reference Group of WHO and UNICEF, and aimed to estimate the global burden of disease attributable to respiratory syncytial virus among young children.
BMGF, the largest private philanthropy in the world (disbursing nearly 3 billion dollars in 2015), devotes a significant portion (about one-third from 1998 to 2007) of its massive funding to research of all sorts (McCoy et al, 2009; Blanchet et al, 2013). Its size and influence also make it a target of frequent and vociferous criticism. (For a recent but hardly unique example see McGoey, 2015.) Our data do not allow for overall judgments about the Foundation or its strategic choices, and different metric criteria may well produce somewhat different rankings of institutions that fund highly cited articles. But what is clear is that the overrepresentation of BMGF among ESI highly cited articles is due to its major role in funding large-scale quantitative research which, we know from our maps, has become central to the GH publication enterprise and its most cited component. Its strategic influence is compounded by its relative generosity. Articles funded by BMGF have fewer co-funders than those supported by other institutions. The 220 articles in which BMGF was involved contain 1075 mentions of funding, translating into an average of 4.9 funders per article. The 429 articles funded by the NIH have an average of 5.8 funders per article, while the 138 articles funded by WHO have an average of 9.7. BMGF is clearly less likely to participate in broad research-funding consortia than many other funders.
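The funders-per-article averages reported above follow from a simple count, which can be sketched as below. This is only an illustration under stated assumptions: the function name, the record layout (one list of funder names per article), and the toy records with placeholder funder names are hypothetical and are not drawn from our database. Only the aggregate of 1075 mentions over 220 articles comes from the text.

```python
from statistics import mean


def mean_funders_per_article(articles, funder):
    """Average number of distinct funders on the articles that mention `funder`.

    `articles` is a list of funder-name lists, one per article (assumed layout).
    """
    counts = [len(set(funders)) for funders in articles if funder in funders]
    if not counts:
        raise ValueError(f"no article mentions {funder!r}")
    return mean(counts)


# Hypothetical toy records with placeholder funder names (illustration only).
toy_articles = [
    ["Funder A"],
    ["Funder A", "Funder B"],
    ["Funder C", "Funder D", "Funder B"],
    ["Funder A", "Funder C", "Funder E"],
]
# Articles mentioning "Funder A" list 1, 2, and 3 funders respectively.
toy_mean = mean_funders_per_article(toy_articles, "Funder A")

# Aggregate reported in the text: 1075 funding mentions across 220 articles.
reported_bmgf_mean = 1075 / 220
print(round(reported_bmgf_mean, 1))
```

A lower average indicates that a funder's articles tend to list fewer co-funders, which is the sense in which the comparison among BMGF, NIH, and WHO is made.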
We get a better perspective on co-funding patterns in Figure 8. Of the 2177 articles with funding information, 925 have only a single funder, leaving us with 1252 co-funded articles as the basis for our map. We limit ourselves to the top 100 funding institutions, the size of the nodes being proportional to the total number of papers funded. Connections between nodes indicate co-funding of articles above a specificity (or random-occurrence) threshold, which accounts for the fact that, in spite of the previously mentioned co-funding activities between BMGF and WHO, these two nodes are not connected. Overall, it is not surprising that there is a certain overlap among institutions in the articles they fund. There is a significant cluster 1 around the US NIH, the BMGF, and a variety of other state funding institutions mainly in English-speaking countries but including Sweden. These have extensive connections to other clusters including cluster 2 made up predominantly of British funding agencies and the European Union. Canadian institutions appear in both clusters 1 and 2 suggesting that researchers in Canada make use of both their North American and Commonwealth connections in obtaining funding. A distinct cluster 3 around WHO may reflect its rather small research budget but also its funding of research in a variety of developing countries ignored by other funders. All these examples, particularly the close proximity between NIH and BMGF, suggest that emerging 'philanthrocapitalism' when applied to research is indeed a hybrid configuration in which governments and philanthropies remain closely allied and support each other (McGoey, 2014).
Finally, at the bottom we see clusters 4 and 5 made up predominantly of pharmaceutical companies and rather isolated from other major funders (with a few exceptions like PEPFAR, the U.S. President's Emergency Plan for AIDS Relief) but closely and intricately linked among themselves. This is not surprising. Researchers who are successful enough to be tagged by one pharmaceutical company are likely to be recruited by other companies looking for experts to sit on boards, lecture, or advise, and whose research they might fund. They are also likely to get funding from non-pharmaceutical sources, but these will be project specific so that links are less dense than those among pharma companies. There are a number of bridging institutions including the US Centers for Disease Control and, more surprisingly, the government of Switzerland.
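The logic of a co-funding map with a chance-corrected link threshold can be sketched as follows. This is a minimal illustration, not the CorTexT procedure: the observed-to-expected co-occurrence ratio used here is a stand-in assumption for the specificity measure, and the function name, parameters, and toy records with placeholder funder names are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cofunding_edges(articles, top_n=100, min_ratio=1.0):
    """Return {(a, b): ratio} for funder pairs co-occurring more than chance predicts.

    The ratio is observed co-funded articles divided by the number expected
    if the two funders were independent (an assumed, simplified measure).
    """
    # Keep only co-funded articles (two or more distinct funders).
    cofunded = [set(funders) for funders in articles if len(set(funders)) > 1]
    n = len(cofunded)
    if n == 0:
        return {}
    # Frequency of each funder among the co-funded articles.
    freq = Counter(f for funders in cofunded for f in funders)
    top = {f for f, _ in freq.most_common(top_n)}
    # Count how often each pair of top funders appears on the same article.
    pair_counts = Counter()
    for funders in cofunded:
        for a, b in combinations(sorted(funders & top), 2):
            pair_counts[(a, b)] += 1
    edges = {}
    for (a, b), observed in pair_counts.items():
        expected = freq[a] * freq[b] / n
        ratio = observed / expected
        if ratio > min_ratio:
            edges[(a, b)] = ratio
    return edges


# Hypothetical toy records: A and B are both frequent but rarely co-fund.
toy = (
    [["A", "B"]]
    + [["A", "C"]] * 3
    + [["B", "D"]] * 3
    + [["A"]]  # single-funder article, excluded from the map
)
edges = cofunding_edges(toy)
print(sorted(edges))
```

In the toy data, the two most frequent funders share one article yet remain unlinked, because that single co-occurrence falls below what their frequencies alone would predict. This is the same reasoning by which two prominent funders that do co-fund some articles can nevertheless appear unconnected on a thresholded map.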

Conclusion
Our scientometric analysis of publications does not permit an engagement with large GH themes like security, globalization, neo-liberalism, and humanitarianism frequently discussed in policy and social science literature. Such engagement requires deep semantic analysis of texts that we leave for another occasion. Nor can our analysis serve as a proxy for the wider and very complex GH domain as a whole. Publications make up only a small part of what is considered GH. But it is possible to say that from the perspective of its vast literature, GH appears as an assemblage of individuals, groups, and institutions concerned with diverse sets of issues (disease categories primarily, but also policy positions, funding choices, scientific issues, and disciplinary interests) that occasionally come together for specific purposes in different permutations and combinations, and that are all identified under the by-now fashionable term 'Global Health.' But it turns out that more than labels hold together this fuzzy intellectual domain. There is in the first instance a core group of authors who have for some time been co-producing papers and reports with one another, and whose publications are highly cited within this literary domain. There are in the second instance a number of central journals, most notably The Lancet, that transcend the narrow specialization characteristic of scientific fields and that serve as major sounding boards for authors seeking a wide audience. Finally, there is a growing body of large-data metrics, most prominently the GBD, that, whatever its origins in World Bank development strategies, and whatever the critiques that continue to be raised against it, now produces data that numerous different groups use for their own distinctive purposes. The GBD and metrics generally have, in other words, 'changed the conversation' as several commentators have noted (most recently Adams, 2016; Fan and Uretsky, 2016; Wahlberg and Rose, 2015).
One way of conceptualizing this pattern is to suggest that the term GH corresponds to a prototypical category (as defined by cognitive scientists) 10 that provides coherence to an otherwise extremely heterogeneous domain. More precisely, clusters of publications that center on metrics, in particular on the development of statistical tools to quantify the 'global burden' of diseases, lie at the core of the domain with linkages extending to more marginal areas. In so doing, this core component equips the entire domain with a distinctive (albeit fuzzy) identity that can be further mobilized for a range of different purposes, which often amount to a seamless combination of political and techno-scientific publications. The (in)famous motto 'if you can't measure it, it doesn't exist' seems to be particularly (and reflexively) appropriate in this respect. There are undoubtedly other linkages that cannot be identified using the mapping techniques of this paper, and that require qualitative as well as different kinds of quantitative methodologies to become visible and susceptible to analysis. But the core structures revealed by this analysis suggest a few of the elements that hold together this elusive but mushrooming publication domain, and provide a useful starting point for further research.
10 In contrast to definition-based models, prototypical categories include a range of entities that may differ substantially but that are more or less related to some central works that are particularly important to that category (see Rosch, 1973).