Oct 20 2012

I just came back from a Hopkins epigenetics meeting and heard a keynote talk from Bert Vogelstein. His group has a robust way of finding real cancer drivers, which he believes form a much smaller set than other annotations suggest. The rule is: > 5 intragenic mutations; for tumor suppressor genes, > 15% of mutations must be inactivating; for oncogenes, > 15% must be recurrent activating mutations. Many well known cancer drivers have 80% of their mutations in these categories, so he believes this is a pretty reasonable criterion.
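As a rough sketch of how this rule could be applied to per-gene mutation counts (my own illustration, not Vogelstein's actual pipeline), here is a small Python function. The class and field names are hypothetical, the thresholds are the ones I noted from the talk, and checking the tumor suppressor criterion before the oncogene criterion is my assumption.

```python
from dataclasses import dataclass


@dataclass
class GeneMutations:
    """Hypothetical per-gene mutation tallies across sequenced tumors."""
    total: int                  # all intragenic mutations observed
    inactivating: int           # e.g. truncating or frameshift mutations
    recurrent_activating: int   # mutations at recurrently hit positions


def classify_driver(gene: GeneMutations,
                    min_mutations: int = 5,
                    min_fraction: float = 0.15) -> str:
    """Apply the rule as noted from the talk (thresholds are from my notes)."""
    # Require more than `min_mutations` intragenic mutations before calling anything.
    if gene.total <= min_mutations:
        return "insufficient data"
    # Assumption: the tumor suppressor test is checked first if both would pass.
    if gene.inactivating / gene.total > min_fraction:
        return "tumor suppressor"
    if gene.recurrent_activating / gene.total > min_fraction:
        return "oncogene"
    return "not called as driver"


if __name__ == "__main__":
    print(classify_driver(GeneMutations(100, 80, 2)))   # tumor suppressor
    print(classify_driver(GeneMutations(100, 1, 80)))   # oncogene
    print(classify_driver(GeneMutations(100, 5, 5)))    # not called as driver
```

The counts in the usage lines are made up to exercise each branch; real input would come from annotated mutation calls.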

By this criterion, they detected only ~120 cancer drivers (~70 tumor suppressors and ~40 oncogenes). It is very hard to target lost tumor suppressors, so oncogenes are better drug targets. All drivers can be organized into 12 core pathways: TGF, Wnt, Hedgehog/GLI, HIF1A, JAK/STAT, NOTCH, G1/S transition, DNA damage, Apoptosis, Chromatin, PI3K/PTEN, RAS/RAF. Interestingly, he believes that whole genome and exome sequencing are reaching a plateau in terms of finding new driver genes, and that many new drivers are related to epigenetics. The next step is to develop early detection methods and understand what these drivers are really doing. In addition to the work on understanding cancer drivers that his lab has been doing for years, they have recently been looking for sensitive biomarkers and techniques to detect early cancer mutations from blood, urine or stool.

Vogelstein mentioned that many cancer mutations that might cause drug resistance are actually already present when patients are first diagnosed with cancer, although they are present in only a tiny percentage of cells. Some people call cells harboring these mutations tumor initiating cells or cancer stem cells, but they might just reflect cancer heterogeneity (this is my interpretation). Unfortunately, most cancer research is focused on treating patients at the last stage, which is too late. He recommended the following about cancer treatment:
1. Treat tumors when they are as small as possible
2. Use combination agents
3. Is it ethical to use single agent drugs in Phase II/III trials? Phase I is OK, but Phase II/III with single agent would do patients harm, since patients are almost certain to develop drug resistance later.

I heard that Vogelstein rarely travels to present at conferences or seminars, and he refuses most of the meeting requests from speakers presenting seminars at Hopkins. He stays focused on important cancer research areas that he believes will make a long term impact. But surprisingly, when I email him to request reagents or information, he always replies himself, and very quickly. I really admire this level of focus and science-driven curiosity. This was the first time I heard him talk (I unfortunately wasn’t there when he came to speak at DFCI in April 2012, when his postdoc trainee Nelly Polyak got tenured at Harvard) and I was totally inspired by his knowledge and dedication. I actually shed some silent tears afterwards and got a little distracted from Doug Higgs’s talk that followed, which was also quite interesting (I will have to read his papers to catch up). I hope we can do some solid cancer work deserving of the inspiration he gave me.

Oct 18 2012

When I started writing blogs, I decided not to write negative things. Today, we did a journal club on all the transcriptome studies in ENCODE, and I just can’t bear it any more. It is mind boggling how many resources were wasted on the ENCODE transcriptome. First of all, very little RNA-seq has been generated (~4 cell lines / year). In addition, different companion papers use different GENCODE versions (v7 and v8, and mind you the current GENCODE is already v11!). Can I trust their annotations, and if so, which version? The companion paper on lncRNA is especially shocking, and all the findings could be explained by an alternative hypothesis: that all the lncRNAs they predicted are transcriptional noise. I actually liked the Mortazavi paper on RNA editing, which I think is very carefully done, but he is probably not really in the transcriptome group.

For ENCODE to really generate useful resources for the community, why bother asking experts to review the proposals and NHGRI program officers to select grants for funding? I heard that the funding decisions did not really agree with the review scores anyway. Instead, let the community vote: every scientist could vote, and their vote could get a higher weight if they could show that they used an ENCODE dataset to publish their own paper. I think the wisdom of the crowd would motivate the funded groups to consciously generate more useful and higher quality data for the community.

Update: We were looking at annotations on transcription start sites of microRNAs. You would think after so many years of ENCODE $, GENCODE has some annotations on this, based on DNase-seq, pol2 / H3K4me3 ChIP-seq, conservation and some other sequence features. But nope, they only have start sites of mature miRNAs. Sigh!!

Oct 17 2012

Recently I went to PSU for a seminar and a junior faculty member asked me about picking future research directions. It reminded me of the Harvard course on happiness taught by Tal Ben-Shahar. Basically, to be happy, pick a direction based on three principles:

Interest: I find reading papers, listening to books (I only have time to “read” audio books on my commute, scientific and non-scientific), and going to conferences great ways to refine my interests. I go to conferences not only to learn more science, meet people, and refine my research interests, but also to find out which directions I definitely do not want to get into. Interests can change and evolve over time. There were cases when I was totally confused and frustrated by a speaker early in my career, only to find the same speaker absolutely fascinating a few years later because I knew more.

Advantage: This includes your natural abilities, previous background, expertise, connections, and any available resources (the Chinese saying 天时, 地利, 人和: the right time, the right place, and the right people) that give you a unique advantage. Actually, advantage can stimulate interest. E.g. I get more motivated to improve our algorithm when I find that it outperforms others.

Fulfill your social value: Find a research direction with a big picture goal that is appealing not only to your colleagues, but also to your family or the president. You can define short term directions towards the long term goal and make slight adjustments to your long term goals, but stay on course. Dedicated, focused effort over a long time can build your expertise and reputation and gradually fulfill your social value.

Aug 25 2012

I was looking at the DMS Bulletin introducing new faculty at HMS last night, and saw that many new faculty gave excellent advice to graduate students. This prompted me to search for “advice to graduate students” on Google (interestingly “advices for graduate students” gives some different hits). Wow, what a treasure trove of wisdom. Here are a few excellent examples:

I wish someone had told me these things when I started graduate school 15 years ago, and I wish I had shown them to our first graduate students at Tongji 3 years ago. Every graduate student should read them when they enter graduate school, and read them again when they have problems during graduate school. It is very interesting to see that some of the advice has been incorporated into our core values, and I should read more to see whether our core values could be improved.

This experience reminded me that we are in such an amazing information age. Any time I have a question, I can find good answers online. I spent several afternoons during the summer teaching Tongji students how to write CVs and papers, scientific marketing, etc. It would probably be better to teach them how to find answers online and learn by themselves, which could help them succeed in their careers after graduate school. I should also make better use of the web to teach myself.

So try it out, do some exploration such as:

How to respond to reviewers’ comments
Time management advice
How to select thesis project
Interacting with graduate advisor
How to read a biological paper
Work life balance tips
How to prepare a scientific talk

Have fun!

Aug 18 2012

On 7/30/2012, we had a senior lab member retreat for our Tongji lab. It was a very fruitful meeting, where senior members made plans for the following year, including grants to apply for, new papers to submit, major research directions and projects, new student training, wet lab establishment, fall lab activities, and logistical issues about lab management.

In addition, we established the lab Core Values. Initially inspired by Zappos, Yong and I have been thinking of establishing the lab Core Values for a while. After some active discussions with the senior members, we came up with the following five points:

  1. Conduct world class research
  2. Make a solid contribution to the community
  3. Motivate scientific curiosity and discovery
  4. Train talents with integrity, initiative, perseverance, and optimism
  5. Build a collaborative and synergistic environment

Actually these probably fit my DFCI lab well too, and we will continue to use them as guidance for many of the activities in the lab. I should spend some time coming up with my personal core values as well.

Jul 26 2012

On a recent NIH study section trip, Xihong told me that it is important to belong to a community. I have picked computational cancer epigenetics as my future research direction, which is naturally interdisciplinary. As a result, I don’t quite belong to any of the following communities (characterized by the conferences they go to): statistics (JSM), bioinformatics (ISMB), genomics (Biology of Genome), chromatin (CSHL, Gordon, or Keystone conferences on chromatin and epigenetics), or cancer (AACR). Most of my closest colleagues go to the Cold Spring Harbor Systems Biology of Gene Regulation meeting, which includes people who use genomics and bioinformatics approaches to study transcriptional and epigenetic gene regulation. However, I follow their work closely, so sometimes I don’t learn as much from this meeting any more.

Over the years, I have grown to enjoy the domain biology meetings, such as cancer or chromatin meetings, much more than the genomics and bioinformatics meetings. In the future, I should alternate between the CSHL Systems Biology and Biology of Genome meetings, then go to one cancer (AACR) and one epigenetics (Keystone or CSHL) meeting every year. Recently there have also been some good cancer epigenetics meetings, which could be very interesting.

Jul 15 2012

Wei Li stopped by Shanghai to attend the Tongji summer camp (for graduate student recruitment). He told me that Gongming Pu said that nowadays the way to publish a Cell, Nature, or Science (CNS) paper is to use new technologies to re-investigate decade-old problems that were published in CNS, and it is especially exciting when the new technology gives results different from those previously reported in those CNS papers.

This is quite an interesting idea. I only decided to focus my research on cancer last spring, so I don’t quite know the general landscape of the cancer field, nor understand what the big and important cancer problems are. I started searching for original research papers related to cancer that were published in Nature, Science, Cell, and Cancer Cell, and only found ~1800 hits since 1990. Even if I include JAMA and the New England Journal, the total number of hits is less than 4000. If I read the abstracts, and occasionally the full paper, of 20 papers a day four days a week, I could finish them all in a year. I will do it with members of the lab in the coming year.
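To double check the reading plan, here is a quick back-of-the-envelope calculation in Python. The function name is just for illustration, and it assumes a steady pace with no weeks off.

```python
import math


def weeks_to_finish(total_papers: int, per_day: int = 20, days_per_week: int = 4) -> int:
    """Whole weeks needed to read `total_papers` at a steady pace (assumes no breaks)."""
    papers_per_week = per_day * days_per_week  # 20 * 4 = 80 papers a week
    return math.ceil(total_papers / papers_per_week)


if __name__ == "__main__":
    print(weeks_to_finish(1800))  # 23 weeks for the Nature/Science/Cell/Cancer Cell hits
    print(weeks_to_finish(4000))  # 50 weeks with JAMA and New England Journal included
```

So even the upper bound of 4000 papers fits in about 50 weeks, which leaves only a couple of weeks of slack in a year.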

Jul 13 2012

In 2011, we spent some effort looking at integrating ChIP-seq with GWAS data. That led me to the realization that for cancer studies, it is much more fruitful to study somatic mutations than germline mutations, and studying normal populations is less likely to be cost-effective.

Ever since we sequenced the LNCaP / abl and MCF7 / LTED genomes, I have been thinking of establishing our whole genome sequence analysis capacity at Tongji University, China. Our assistant professor Jianxing Feng got his CS PhD from Tsinghua University specializing in algorithms, so we thought that he would like the computational challenge. We held a focused journal club reviewing the high impact computational and biological papers on genome sequencing. To our surprise and disappointment, most of the existing algorithms are just brute force, intuitive software with little algorithmic or statistical component.

Going to IBW, I realized that we are late to the whole genome and exome sequencing game. Many computational groups, domestic and overseas, are already analyzing massive amounts of genome/exome sequencing data. The trend is clear: the first group can publish a good paper with only one whole genome; the second group will need to sequence 2 genomes; then future groups will need to sequence 5 (pairs of) genomes, 10, 50, 100, etc. to publish a good paper. The bar will rise just as it did for GWAS studies: the community will expect sequencing studies to explain the function and consequences of these mutations. That’s where we have some expertise and should be prepared to make an impact.

Recent exome and whole genome sequencing studies comparing cancer vs. normal or primary vs. metastatic cancer genomes have yielded many exciting findings. The easy cases for investigating functional mutations are genes with copy number gain or loss, and most of these genes are clear oncogenes or tumor suppressors that were likely already identified with CGH or SNP arrays. The functional consequences of these genes are easy to investigate with knockdown / knockout or overexpression assays. Our current approach of combining RNA-seq with DNase-seq to profile the wild type vs knockdown / overexpression conditions is a good screening approach for generating initial hypotheses.

One area that is likely to create new research opportunities is long noncoding RNA (lncRNA). Theoretically CGH and SNP studies should have information on their copy number changes, except that previously people didn’t realize that they were genes. In addition to using RNA-seq and DNase-seq to investigate their function, one informative experiment might be to use oligo probes to specifically pull down the lncRNA and mass spec to study the proteins that interact with it. John Rinn seems to have some expertise in this area, and we should also explore this technique.

If enough tumors have been sequenced and people still observe only point mutations but no copy number variations in a gene, it would indicate that the mutation does not simply strengthen or weaken regulation of the existing gene network. The reason is that tumors could increase or decrease copy number to achieve stronger or weaker regulation. Instead, the mutation must be creating new links in the regulatory network. This type of gain-of-function mutation could be investigated by knocking in genes carrying the specific mutation and examining the downstream consequences. This is not a trivial experiment, and we might need to think of more efficient ways to study these mutations.

May 06 2012

My husband recommended that I check out the Khan Academy. The founder, Salman Khan, a graduate of MIT and HBS, started tutoring his cousin and decided to put his videos online. He got overwhelmingly good feedback, and decided to quit his full time job as a hedge fund analyst to work full time on online education. For more information, check out Wikipedia.

I checked out the website, and the content is quite amazing. They have over 3,100 short videos (usually 5-15 min) teaching content from elementary math to art history and finance. Khan’s style is very interesting and leisurely. You can learn and also be entertained. I always admire people who can teach and make it fun. There are also very interesting and creative videos like the doodling math on spirals and Fibonacci numbers. I have probably learned about Fibonacci numbers a number of times in my life, but watching this video makes the content unforgettable for the rest of my life. Technology is changing the world and the way we learn!