Tag Archives: Health

A crow with a pack of cigarettes

One of my professors used to say that smoking a cigarette is like firing a shotgun at your DNA, and he might have been right.
That smoking isn’t good for your health shouldn’t be news to you. But there are many other unhealthy habits that lots of people have.

But well-being is also a difficult topic, as a lot of misinformation is spread about it. And common sense is not necessarily the best way to decide what is healthy.
Furthermore, reading scientific papers on this topic is hard, and it’s even harder to integrate findings from different studies. On top of that, there might not be a one-size-fits-all solution. People are different, and so is the kind of lifestyle that would be healthy for them.

Nevertheless, I will try my best to talk about health here. I will try to summarise scientific papers for you and provide you with sources that will enable you to educate yourself on this topic.

If you have any topic suggestions for me, just write me a message or a comment. I will see what I can do! :)

Activated Charcoal Latte

Does It Make You Slim?

That is a question someone asked me after seeing that I had begun blogging about scientific topics. This someone had apparently been getting a lot of ads for activated charcoal latte, which is also called Black Latte here in Germany.
A few months ago, when I got the question, I went on PubMed and searched for papers on this topic…
Tl;dr I didn’t find any! 😀
However, this doesn’t mean that there aren’t any. And in no way does it mean that it doesn’t work.

A picture of an empty coffee cup, which didn't contain Activated Charcoal Latte
A picture of a coffee cup I took. The beverage was coffee, though, not activated charcoal latte.

But! It seems that there is some evidence from the 1980s and 1990s that activated charcoal can reduce LDL cholesterol (the bad one) in patients with high blood cholesterol (10.1016/S0140-6736(86)90054-1). Neuvonen et al. even postulated a mechanism behind this effect (10.1111/j.1365-2362.1989.tb00226.x).

Sounds nice, doesn’t it? Maybe not as nice as getting slim, but still nice. Well… unless you consider some of its adverse effects. According to WebMD it can, for example, cause constipation and reduce the absorption of some nutrients. So you probably shouldn’t take activated charcoal, latte included, as a regular supplement unless your doctor tells you to.

Huh… Short post today? It seems so. 🙂 If you like this format (and also if you don’t), please tell me! And if you have a question, just ask me. I’ll do my best to answer it.



Map Plots About the Global Burden of Disease

A practical example

As promised in another post, I will show you how to make map plots with R. For this purpose I will use the ggmap package, which makes this a relatively easy task.
But before I begin with the actual code, let me give you a short motivation.

Why use map plots

There can be various motivations for using map plots. For example, if you’re a journalist and want to visualize some kind of event (say, earthquakes) in a regional context, a map plot is a very illustrative way of showing your data.
Or if you want to present some kind of data about places or countries, map plots are always a good option.

The first time I made a map plot was actually as part of an awesome lecture I had back in Munich at the TUM. Afterwards I got the chance to use this skill right away, in the next semester, for my Bachelor’s thesis.
As some of you might know, the algorithm which I improved and implemented for my thesis was applied to mental disorders. While writing, I found that emphasizing the prevalence of mental disorders in the world was a good starting point for both the thesis and the accompanying presentation. To do so, I used a map plot.
That’s basically what I will do now as well, but this time for cancer.

On a side note, I’m not saying that you should make this kind of plot for every thesis or presentation on a disease topic. It’s just one possible starting point and not necessarily the best.
So please don’t just mindlessly copy what I’m doing here. 🙂

Getting the data for disease related map plots

First let’s load all the packages we will need for this little exercise.
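A loading step along these lines should work. The package list is an assumption based on the packages described in this post; ggplot2 is added because `ggplot()` and `map_data()` are used further down (loading ggmap would attach it as well). The names `pkgs`, `installed` and `notInstalled` are only for illustration.

```r
# Packages this exercise relies on (assumed from the descriptions in this post)
pkgs <- c("XLConnect", "data.table", "ggplot2", "maps", "ggthemes")

# Find out which of them can actually be loaded on this machine
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
notInstalled <- pkgs[!installed]
if (length(notInstalled) > 0) {
  message("Please install first: ", paste(notInstalled, collapse = ", "))
}

# Attach everything that is available
for (pkg in pkgs[installed]) {
  library(pkg, character.only = TRUE)
}
```

If everything is installed, this is equivalent to one plain `library()` call per package.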


XLConnect is a package for loading Excel sheets, which we will need.
You have probably already noticed that I like to use data.table. It’s just super fast and comfy for some procedures, and it has some nice synergies with ggplot2.
The maps package contains, as the name suggests, map data that can be plotted. Alternatively one could also use ggmap.
And ggthemes contains a neat theme for maps, which I will use.

First let’s load our world map. This data.table contains region names and the boundaries of those regions as longitudes and latitudes. ggplot can plot those as polygons.

mapdata <- data.table(map_data("world"))

OK, done. Now we need to download the data on 2004’s mortality from the WHO.

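# Download the WHO workbook; mode = "wb" keeps the binary file intact on Windows
download.file("https://www.who.int/entity/healthinfo/global_burden_disease/gbddeathdalycountryestimates2004.xls", "gbd.xls", mode = "wb")
tmp <- readWorksheetFromFile(file = "gbd.xls", sheet = "Deaths 2004")

# Column 1 holds the cause codes, row 6 the country names
causes <- tmp$Col1[14:143]

countries <- unname(tmp[6,7:198])
deathRates <- tmp[14:143,7:198]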

You should probably take a look at the Excel file yourself to understand it and what I’m doing later. The file is made for humans to look at and not for machines to read directly, which is why we have to do some cleaning and transforming. In my experience as a Bioinformatics student, this is something you have to do almost always. Even if you have a machine-readable format, there’s no perfect data set. You will always have some missing data or have to transform your data in some way.
And this isn’t necessarily a trivial step. Often you will spend a lot of time here. And that’s good. If cleaning data were trivial, we wouldn’t need data scientists.

Cleaning data

To begin with, we have to transform the death rates to numeric values, because they’re characters (strings) right now. For this purpose we also have to remove the commas used as thousands separators. You see? What’s done to make the data more human readable makes it less machine readable. That’s often the case.
Then we set the column names to the countries and combine the matrix with the vector of causes into a data.table.

deathRatesNum <- matrix(as.numeric(gsub(",", "", as.matrix(deathRates))), nrow = dim(deathRates)[1])
## Warning in matrix(as.numeric(gsub(",", "", as.matrix(deathRates))), nrow =
## dim(deathRates)[1]): NAs introduced by coercion
colnames(deathRatesNum) <- countries
DT <- data.table(causes = causes, deathRatesNum)
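To see what the conversion does, here is a small sketch on made-up values (the `raw` vector is invented for illustration and not taken from the WHO file). Entries that cannot be read as numbers become NA, which is where a warning like the one above comes from.

```r
# Made-up example strings: thousands separators and a non-numeric placeholder
raw <- c("1,234", "56", "12,345,678", "-")

# Strip the commas, then convert; "-" cannot be parsed and becomes NA.
# suppressWarnings() only hides the "NAs introduced by coercion" warning here.
cleaned <- suppressWarnings(as.numeric(gsub(",", "", raw)))
print(cleaned)
```

It is worth counting the NAs afterwards, for example with `sum(is.na(cleaned))`, to make sure that only the entries you expect were lost.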

Now we want a clean, also called long, data set. This new data set will have only three columns: two variables (causes and region), which together uniquely identify the third, the death rate.
Similar to a database, we can also set those variable columns as keys, which makes lookups very fast.

DTclean <- melt(DT, id.vars = "causes",  variable.name = "region", value.name = "deathRate")
setkey(DTclean, causes, region)
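What the reshaping does is easier to see on a tiny example. The following sketch uses a made-up table with two invented regions (`Atlantis` and `Utopia` are not in the WHO data) and the same `melt()` call as above.

```r
library(data.table)

# Made-up wide table: one row per cause, one column per region
wide <- data.table(causes   = c("W000", "W060"),
                   Atlantis = c(100, 20),
                   Utopia   = c(50, 5))

# Long format: one row per (cause, region) pair
long <- melt(wide, id.vars = "causes",
             variable.name = "region", value.name = "deathRate")

# Sort by the two identifying columns and mark them as key
setkey(long, causes, region)
print(long)
```

The two region columns have turned into four rows, and each death rate is now identified by its cause and region.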

Next let us see, if we have some regions in our data.table that aren’t in our map.

DTclean[!region %in% mapdata$region, unique(region)]
##  [1] Antigua and Barbuda                      
##  [2] Brunei Darussalam                        
##  [3] Congo                                    
##  [4] Côte d'Ivoire                            
##  [5] Democratic People's Republic of Korea    
##  [6] Iran (Islamic Republic of)               
##  [7] Lao People's Democratic Republic         
##  [8] Libyan Arab Jamahiriya                   
##  [9] Micronesia (Federated States of)         
## [10] Republic of Korea                        
## [11] Republic of Moldova                      
## [12] Russian Federation                       
## [13] Saint Kitts and Nevis                    
## [14] Saint Vincent and the Grenadines         
## [15] Serbia and Montenegro                    
## [16] Syrian Arab Republic                     
## [17] The former Yugoslav Republic of Macedonia
## [18] Trinidad and Tobago                      
## [19] Tuvalu                                   
## [20] United Kingdom                           
## [21] United Republic of Tanzania              
## [22] United States of America                 
## [23] Venezuela (Bolivarian Republic of)       
## [24] Viet Nam                                 
## 192 Levels: Afghanistan Albania Algeria Andorra ... Zimbabwe

As expected, there are 24 regions from the WHO sheet that are not in the map data. Even though there’s probably a more elegant solution, I will change them manually. It’s work that has to be done only once, and it’s probably only necessary to fix the big countries. So this is bearable.

DTclean[region == "Brunei Darussalam", region := "Brunei"]
DTclean[region == "Congo", region := "Republic of Congo"]
DTclean[region == "Democratic People's Republic of Korea", region := "North Korea"]
DTclean[region == "Iran (Islamic Republic of)", region := "Iran"]
DTclean[region == "Côte d'Ivoire", region := "Ivory Coast"]
DTclean[region == "Lao People's Democratic Republic", region := "Laos"]
DTclean[region == "Libyan Arab Jamahiriya", region := "Libya"]
DTclean[region == "The former Yugoslav Republic of Macedonia", region := "Macedonia"]
DTclean[region == "Micronesia (Federated States of)", region := "Micronesia"]
DTclean[region == "Republic of Moldova", region := "Moldova"]
DTclean[region == "Republic of Korea", region := "South Korea"]
DTclean[region == "Russian Federation", region := "Russia"]
DTclean[region == "Serbia and Montenegro", region := "Serbia"]
DTclean[region == "Syrian Arab Republic", region := "Syria"]
DTclean[region == "United Republic of Tanzania", region := "Tanzania"]
DTclean[region == "United Kingdom", region := "UK"]
DTclean[region == "United States of America", region := "USA"]
DTclean[region == "Venezuela (Bolivarian Republic of)", region := "Venezuela"]
DTclean[region == "Viet Nam", region := "Vietnam"]
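One possibly more compact variant is a lookup vector that maps WHO spellings to map spellings. This is only a sketch on a toy character vector; the names `renames`, `region` and `hit` are made up for illustration, and only three of the pairs from above are included.

```r
# Lookup vector: names are the WHO spellings, values the map spellings
renames <- c("Viet Nam"           = "Vietnam",
             "Russian Federation" = "Russia",
             "United Kingdom"     = "UK")

# Toy region column; "Germany" needs no change
region <- c("Viet Nam", "Germany", "Russian Federation")

# Replace only the entries that appear in the lookup vector
hit <- region %in% names(renames)
region[hit] <- unname(renames[region[hit]])
print(region)
```

The advantage is that all spelling pairs live in one place, so adding a country means adding one entry instead of one more line of code.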

And yes, of course the work isn’t completely done yet. We should also check whether there are regions in the map data that aren’t in the WHO data set. This could be due to various reasons, one being that a region isn’t a member of the WHO and therefore the WHO doesn’t publish data on it.
Or, more likely, that a country from the WHO data set spans more than one region on the map, Serbia and Montenegro being such a case.
However, I’m lazy now and I won’t do this today. How about you doing it and writing me a comment? 😛 Let it be a team effort.
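As a starting point for that reverse check, a set difference is probably enough. The sketch below uses two made-up vectors (`mapRegions` and `whoRegions`) as stand-ins for the region names of the map and of the WHO table.

```r
# Toy stand-ins for the region names in the map data and in the WHO table
mapRegions <- c("Serbia", "Montenegro", "Greenland", "Germany")
whoRegions <- c("Serbia", "Germany")

# Regions that are drawn on the map but have no WHO row
onlyOnMap <- setdiff(mapRegions, whoRegions)
print(onlyOnMap)
```

With the real objects this would be something like `setdiff(unique(mapdata$region), as.character(unique(DTclean$region)))`.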

Making the map plots

OK, before we do the actual plotting, let’s first calculate what share of all deaths in each country is caused by cancer (as a fraction between 0 and 1). In detail, I do this by joining the data.table with itself.
On a side note: W000 is the WHO code for all causes of death combined and W060 the code for malignant neoplasms, which is a more formal name for cancer.

Then we need to join the data.table with the map on the region name.

DTcaused <- DTclean[causes == "W000"][DTclean[causes == "W060"], on = "region"][, .(region, percentageCaused = i.deathRate / deathRate)]

deathrateMap <- mapdata[DTcaused, on = "region", allow.cartesian=TRUE, nomatch = 0]
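The self-join can look a bit cryptic, so here is the same idea on a made-up long table with two invented regions. In `X[i, on = "region"]` the columns of `i` that clash with columns of `X` get the prefix `i.`, which is why the cancer deaths show up as `i.deathRate`.

```r
library(data.table)

# Made-up long table: total deaths (W000) and cancer deaths (W060) per region
toy <- data.table(causes    = c("W000", "W060", "W000", "W060"),
                  region    = c("A", "A", "B", "B"),
                  deathRate = c(200, 50, 100, 10))

allCauses <- toy[causes == "W000"]
cancer    <- toy[causes == "W060"]

# Join on region, then divide cancer deaths (i.deathRate) by all deaths
share <- allCauses[cancer, on = "region"][
  , .(region, percentageCaused = i.deathRate / deathRate)]
print(share)
```

Region A ends up with a quarter of its deaths caused by cancer and region B with a tenth.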

And finally we can do our plot. For this purpose we first plot all regions in grey and as overlay we fill the countries, that we have data on, with a color between grey and red depending on how frequent cancer as a death cause is.

g <- ggplot() + geom_polygon(data = mapdata,  aes(long, lat, group = group), fill = "grey")  
g <- g +  geom_polygon(data = deathrateMap,  aes(long, lat, group = group, fill = percentageCaused)) 
g <- g + scale_fill_gradient(low = "grey", high = "red", aesthetics = "fill", name = "Percentage of\ndeaths caused\nby cancer") 
g + ggthemes::theme_map()
World map showing on a scale from grey to red the percentage of deaths cancer is responsible for in different countries. According to it, the USA, Canada, Australia and Europe have the highest shares of deaths due to cancer, with up to 30 percent.
World map showing on a scale from grey to red the percentage of deaths cancer is responsible for in different countries

And of course there’s one thing about this plot that could be misleading. Regions with missing data and regions with a very low share of cancer deaths will both be grey, so you hopefully see the potential problem here.
It’s not necessarily wrong or bad to do so. But I hope you recognize how someone could make a plot this way to mislead their audience. That’s why I recommend that, when looking at plots, you think not only about what is shown, but also about what isn’t shown. No large data set is complete, so ask the people who present it to you how they handled missing data points.
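One way to avoid that ambiguity is to let the colour scale start at a colour other than grey and to reserve grey for missing values. This is a sketch on two made-up square “countries”, not the plot from above; the `squares` data and the limits of 0 to 0.3 are assumptions for illustration.

```r
library(ggplot2)

# Two made-up square "countries"; B has no data
squares <- data.frame(
  long  = c(0, 1, 1, 0,  2, 3, 3, 2),
  lat   = c(0, 0, 1, 1,  0, 0, 1, 1),
  group = rep(c("A", "B"), each = 4),
  percentageCaused = rep(c(0.02, NA), each = 4)
)

g <- ggplot(squares, aes(long, lat, group = group, fill = percentageCaused)) +
  geom_polygon(colour = "black") +
  # low values are drawn white and missing values grey, so they stay distinguishable
  scale_fill_gradient(low = "white", high = "red", na.value = "grey",
                      limits = c(0, 0.3))
```

Typing `g` at the console draws it: square A is almost white, square B is grey.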

So what does this map actually say? From my perspective, nothing surprising. At the time this data set was captured, cancer was (and probably still is) mostly a problem of industrialized countries, and it doesn’t seem to be primarily connected to geography (can you see how Israel, Japan and South Korea pop out?).
The difference between the USA and Canada could be something interesting, though.
But this map, in my opinion, shows very clearly that cancer is one of the leading causes of death in the developed world, which is also the reason why we spend so much money on researching it.

However, the main purpose of this post was to show you how to make such plots and not to discuss the reasons for different causes of mortality.
Ultimately I hope that this post has helped you.

Cite/mention your sources

Of course it is important that you mention your sources (cite them if you write a paper). This is because your approach has to be reproducible and you have to give those people, who did the preliminary work, credit for it.

In R you can get the proper citations for the packages you used the following way:
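The output below looks like what `citation()` prints, so the calls were presumably along these lines. The loop and the `requireNamespace()` guard are additions to keep the sketch runnable when a package is missing.

```r
# Print the citation of each package that is installed
for (pkg in c("ggmap", "maps")) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    print(citation(pkg))
  }
}

# The same works for R itself; toBibtex() gives only the BibTeX entry
rCitation <- citation()
print(toBibtex(rCitation))
```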

## To cite ggmap in publications, please use:
##   D. Kahle and H. Wickham. ggmap: Spatial Visualization with
##   ggplot2. The R Journal, 5(1), 144-161. URL
##   http://journal.r-project.org/archive/2013-1/kahle-wickham.pdf
## A BibTeX entry for LaTeX users is
##   @Article{,
##     author = {David Kahle and Hadley Wickham},
##     title = {ggmap: Spatial Visualization with ggplot2},
##     journal = {The R Journal},
##     year = {2013},
##     volume = {5},
##     number = {1},
##     pages = {144--161},
##     url = {https://journal.r-project.org/archive/2013-1/kahle-wickham.pdf},
##   }
## To cite package 'maps' in publications use:
##   Original S code by Richard A. Becker, Allan R. Wilks. R version
##   by Ray Brownrigg. Enhancements by Thomas P Minka and Alex
##   Deckmyn. (2018). maps: Draw Geographical Maps. R package version
##   3.3.0. https://CRAN.R-project.org/package=maps
## A BibTeX entry for LaTeX users is
##   @Manual{,
##     title = {maps: Draw Geographical Maps},
##     author = {Original S code by Richard A. Becker and Allan R. Wilks. R version by Ray Brownrigg. Enhancements by Thomas P Minka and Alex Deckmyn.},
##     year = {2018},
##     note = {R package version 3.3.0},
##     url = {https://CRAN.R-project.org/package=maps},
##   }
## ATTENTION: This citation information has been auto-generated from
## the package DESCRIPTION file and may need manual editing, see
## 'help("citation")'.

You get the idea. Also cite the other packages if you use them in your publication or thesis.
The output includes a BibTeX entry, so I hope you know what to do with it. 😛

Of course you have to cite the data on the global burden of disease as well. So here is the formatted citation for it:

WHO. (2004). The global burden of disease: 2004 update: causes of death. 2004 Update, 8–26.

And last, but not least, please also mention me. This is not a necessity, however, but a sign of respect towards my work. Respect is an important thing, and unfortunately it is not given often enough in our society.
