Another instance of a combination of trends in some circles :
* the publishing of one's own medical data
* the application of open data principles to one's own data
* the crowd sourcing of research, and
* some faith in the generally beneficial net effects of information technologies (wonders if many Arabs or Chinese would do the same)

Pierrot

------------------------------------------------------------------------

Open Sourcing My Genetic Data

By ManuSporny <http://manu.sporny.org/author/manusporny/> On *February 12, 2011* In Genetics <http://manu.sporny.org/category/genetics/> With 29 Comments <http://manu.sporny.org/2011/public-domain-genome/#comments>

http://manu.sporny.org/2011/public-domain-genome/

Today, I published all of my known genetic data as open source and released all my rights to the data <https://github.com/msporny/dna/raw/master/README>. Roughly 1 million of my genetic markers are now in the public domain. I believe that I’m one of the first people in the world to commit my genetic data into a decentralized source control system <https://github.com/msporny/dna> [ed: orta <https://github.com/orta/dna> was the first]. The first reactions that I received when I told some of my friends that I was going to do this were a combination of shock and skepticism.

/“Why would you do something like that?”/
/“Aren’t you afraid that somebody is going to use that against you?”/
/“What if your healthcare provider got a hold of that? They’d love to look through it in order to deny you for some pre-existing condition!”/
/“Ugh, I’d never want to know that sort of stuff about myself!”/
/“What if somebody clones you!?”/

I’ve thought long and hard about each of those questions and the many more that you ask yourself before publishing this sort of personal data. There are large privacy implications in doing this. However, speaking solely for myself, I think the benefits outweigh the drawbacks.
I’ll explain my thought process behind each of those questions in a separate blog post.

However, the result of that thought process is that I’m releasing my genetic data today – that’s what I’d like to focus on in this blog post. So, let’s explore exactly what this data is and how I hope people that write software will use it.


Your Genetic Code

There is a website called 23andme.com <http://www.23andme.com/> that is in the business of analyzing your DNA. To become a member of the service, you pay a fee, they send you a test tube, you spit in the test tube and send it back to them. They then take your spit and place it onto something called a *genotyping beadchip*. In this particular case, my spit was placed onto the /Illumina OmniExpress Plus Genotyping Beadchip/. This particular chip is capable of detecting around one million genetic markers. These markers are called *single-nucleotide polymorphisms* or SNPs <http://en.wikipedia.org/wiki/Single-nucleotide_polymorphism> (pronounced ‘snip’) for short.

In combination, these SNPs can tell you quite a bit about your genetic makeup: things such as your eye color, hair color, hair curl, whether you are at an increased risk for diabetes, where your ancestors came from, or even things like if you’re resistant to the HIV virus or if you have the type of muscles that would make you a good sprinter.

There are around 10 million SNPs in the human genome; the Illumina chip can currently analyze around 1 million of them (966,977, to be exact). Of those roughly 1 million pieces of data, all of science only knows what around 14,515 of them do. Of the SNPs that we know about, we’re still shaky about all of the things that many of them affect – we’re not so sure about what the data is telling us. On the 23andme site, they only list around 160 SNPs and their effect on you. This means that of the raw data I’m publishing today, science still doesn’t know what 952,462 of these markers do.
Talk about a treasure trove of information, just waiting to be unlocked! As science marches steadily onward, we’ll learn more about each one of those 952,462 markers and how they affect how we are born, grow, live and die.

One of the best features of 23andme is that they allow you to download your entire genetic profile from the Illumina chip in a raw, non-proprietary format. This is very big news for people that are capable programmers. It means that for the first time in history, there is an inexpensive service that can extract, decode and export your genetic information to a non-proprietary file format.


Commit-ment

As an open source software developer, there are certain commits that you make to a public source code repository that leave you feeling better about the state of the world. This was certainly one of them for me:

msporny@tao:~/work/dna$ git add ManuSporny-genome.txt
msporny@tao:~/work/dna$ git commit -a
[master a08b027] Added my genome into source control.
 1 files changed, 966992 insertions(+), 0 deletions(-)
 create mode 100644 ManuSporny-genome.txt

Doing that made me realize how quickly we’re narrowing in on some of the most debilitating human diseases. It gave me hope that our children may enjoy a far better quality of healthcare than we do today. Most of all, it gave me hope that we, as a society, will be able to better help the nurses, doctors and medical researchers – not just with money, but with our time, expertise and energy. That commit sent chills up my spine – to me, it symbolized a brighter future for all of us.

So, now that all of us can get a hold of that data, what can we do with it?


Analyzing your Genetic Data

23andme does a great job giving you reports on research that they’re confident of. For example, my risk for Age-related Macular Degeneration is 13.4%. The average is 7%, which means that I’m about 1.91 times more likely (13.4 / 7 ≈ 1.91) than the average person to start losing my eyesight as a result of old age.
This makes sense as one of my grandparents has a bad case of age-related macular degeneration. There are around 160 of these types of reports that you get with your 23andme data, but what if you want to dive deeper into your genetic code?

Code is code, whether it is 1s and 0s or A, G, C, and Ts. Analyzing code and data is something that many Computer Scientists do quite often and quite well. Think of the amount of data that Facebook, Google and Twitter deal with on a daily basis. Think about how quickly you can search over a trillion documents on Google (less than a second in most cases).

Personally, I was expecting the same sort of instant searching and analysis functionality on 23andme. It’s just not there. Don’t get me wrong, 23andme is a great service and if this kind of stuff interests you, you should definitely get a kit right now. The kits go on sale twice a year. I got my spit analyzed for $150 total – it’s a deal, any way that you look at it. That and you get instant access to your raw data – that’s the best part.

However, searching through your raw data on 23andme sucks. Remember, there are only about 160 reports on the 23andme site, but there are over 14,515 SNPs that are known. If you want to find out more than just the 160 reports that 23andme has, there is this great website out there called SNPedia.com <http://www.snpedia.com/>. SNPedia is basically the Wikipedia of genetic information.

Keep in mind there are usually many SNPs that come into play for traits like eye color, hair color or certain types of cancer, or where your ancestors came from. 23andme does the heavy lifting for most of their reports, but there are many SNPs that they don’t show you in their reports. So, if you want to find out about anything that is not on the 23andme site, you have to manually search for the SNPs you’re looking for on SNPedia. To make this even more difficult, SNPs have fairly opaque names like rs1815739 <http://www.snpedia.com/index.php/Rs1815739>.
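Looking up several markers at once is the kind of task a short script handles well. The sketch below is only an illustration and is not taken from the blog post: it assumes the raw export is a tab-separated text file with `#` comment lines and four columns (rsid, chromosome, position, genotype), and the function names `load_genotypes` and `lookup`, as well as the sample rows, are made up for the example.

```python
import csv
import io
from typing import Dict, Iterable, Optional, TextIO


def load_genotypes(raw: TextIO) -> Dict[str, str]:
    """Map each rsid to its genotype string, skipping '#' comment lines.

    Assumed layout (not confirmed by the post): rsid, chromosome,
    position, genotype, separated by tabs.
    """
    genotypes: Dict[str, str] = {}
    data_lines = (line for line in raw if not line.startswith("#"))
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) < 4:
            continue  # ignore blank or malformed lines
        rsid, _chromosome, _position, genotype = row[:4]
        genotypes[rsid] = genotype
    return genotypes


def lookup(genotypes: Dict[str, str], rsids: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return the genotype for each requested rsid, or None if it is absent."""
    return {rsid: genotypes.get(rsid) for rsid in rsids}


# Made-up sample rows standing in for a real export file.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAA\n"
    "rs0000002\t2\t2000\tCT\n"
    "rs0000003\tX\t3000\tGG\n"
)

if __name__ == "__main__":
    genotypes = load_genotypes(io.StringIO(SAMPLE))
    wanted = ["rs0000001", "rs0000003", "rs9999999"]
    for rsid, genotype in lookup(genotypes, wanted).items():
        print(rsid, genotype or "not in file")
```

If the downloaded file follows the assumed layout, passing an opened copy of it (for example `open("ManuSporny-genome.txt")`) to `load_genotypes` in place of the sample should work the same way.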
If you are looking for more than 1 SNP, it can take a long time. You have to first look up the original SNP that interests you on SNPedia. Once you have the original marker on the screen, it might link to upwards of 10 additional SNPs that affect the trait you’re researching. You have to manually type in each SNP one-by-one into the 23andme site, click “Search”, write down the sequence for that SNP, such as “GG” or “AA” and repeat this process for as many SNPs as you’re looking for.


What can Web Programmers do for Genetics?

Manually searching for these markers is unnecessarily time consuming. Doing stuff like this is why we have computers – they’re good at computing! Your genetic data fits in 25 Megabytes of memory – a tiny, tiny fraction of the tiniest USB thumb-drive. This genetic data is the equivalent of 5 MP3 songs, a small website, or 5-7 high resolution digital photos. You can type a Google search for “eye color” and get back a result in less than a second after searching /the entire Internet/. Why can’t you do that for your genetic data?

I think programmers, especially Web programmers, can do better. That’s the driving reason that I’m releasing this data into the public domain. I’d like to see an open source website that can search SNPedia in the blink of an eye – just like Google Instant does. If I type in “blood type” it should tell me all of the things it can find out about my blood type. If I type in “eyes”, it should be able to tell me everything that it knows about me concerning macular degeneration, eye color, etc. There is a lot of data out there on SNPedia, we just need a nice, personalized interface to work with it.

That’s just one idea, though. There are thousands of other ideas hidden away out there on the Web. One of them may be hiding in that beautiful brain of yours. I hope that you will share this story with other people that may be interested in helping us to reduce suffering in the world.
I hold great hope for this new technology – we are primed for some amazing health-related advances in our lifetime. If you know how to program, design or write – you can help. You can start by blogging or tweeting about this post, or you can:

Download Manu Sporny’s genetic data. <https://github.com/msporny/dna>
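As a rough illustration of the search idea in the post, the sketch below joins a small table of annotations to a person's genotypes and filters it by keyword. Everything in it is hypothetical: the rsids, the trait descriptions, and the function name `search_traits` are placeholders invented for the example, not data or an API from SNPedia or 23andme.

```python
from typing import Dict, List, Tuple

# Placeholder annotations (rsid -> trait description); not real SNPedia data.
ANNOTATIONS: Dict[str, str] = {
    "rs0000001": "eye color",
    "rs0000002": "age-related macular degeneration (eyes)",
    "rs0000003": "blood type",
}

# Placeholder genotypes for one person; rs0000003 is deliberately missing.
GENOTYPES: Dict[str, str] = {"rs0000001": "AA", "rs0000002": "CT"}


def search_traits(
    query: str,
    annotations: Dict[str, str],
    genotypes: Dict[str, str],
) -> List[Tuple[str, str, str]]:
    """Return (rsid, trait, genotype) for every trait containing the query.

    Matching is a case-insensitive substring test; markers without a
    genotype in the person's data are reported as "not genotyped".
    """
    needle = query.strip().lower()
    if not needle:
        return []
    hits = []
    for rsid, trait in sorted(annotations.items()):
        if needle in trait.lower():
            hits.append((rsid, trait, genotypes.get(rsid, "not genotyped")))
    return hits


if __name__ == "__main__":
    for rsid, trait, genotype in search_traits("eye", ANNOTATIONS, GENOTYPES):
        print(f"{rsid}: {trait} -> {genotype}")
```

A plain substring scan is enough at this scale, since the whole data set fits comfortably in memory; a real site would presumably want a proper index and curated annotations.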
It is interesting to note that 23andMe was co-founded by Anne Wojcicki,
who is married to Sergey Brin of Google. Also of interest, Google invested a total of 11.9 million dollars in 23andMe.

http://en.wikipedia.org/wiki/23andMe

--
Catherine Roy
http://www.catherine-roy.net

On Sun, February 13, 2011 11:54 am, Pierrot Péladeau wrote:
> Another instance of a combination of trends in some circles :
>
> * the publishing of one's own medical data
> * the application of open data principles to one's own data
> * the crowd sourcing of research, and
> * some faith in the generally beneficial net effects of information
> technologies (wonders if many Arabs or Chinese would do the same)
>
> Pierrot
>
> _______________________________________________
> CivicAccess-discuss mailing list
> http://lists.pwd.ca/mailman/listinfo/civicaccess-discuss