Part I: Scientific insight
Memory is an instance in which an organism's current behavior is determined by some aspect of its previous experience [30]. Memory is an emergent property that researchers observe as an overt change at one or more levels of biological organization. Identifying the molecular basis of individual phenotypic variation is a major challenge for modern biologists. This research provides fundamental new insights into how transcriptional variation arises within the subfields of the hippocampus that are crucial for learning, memory, and cognition. As a whole, my research provides a thorough analysis of transcriptional activity in the hippocampus that confirms many known properties and suggests new avenues for research.
What I found
I found that the Dentate Gyrus (DG) subfield responds to consistently reinforced active place avoidance training with a strong upregulation of transcription factors. The CA1 transcriptomes reveal the expression of genes related to synaptic membrane activity in response to both cognitive and stressful manipulations. My results confirm that the well-known “memory molecules” continue to appear on gene lists as essential, but my research builds on this prior work by identifying new genes whose activity is related to processes involved in learning and memory. There appear to be robust subfield-specific differences that may reflect constitutive processes that distinguish the functional classes of neurons that make up each subfield. It will be interesting to see the extent to which other behavioral manipulations influence plasticity in gene expression.
I found that knocking out the FMR1 gene (loss of which causes fragile X syndrome, the leading known single-gene cause of autism) changes the strategy that mice use to solve place avoidance tasks and disrupts the expression of about 100 genes. Among the affected processes, calcium signaling appears to be downregulated, which may provide insight into how neural function, in general, is altered in the FMR1 knockout brain.
Using automated computing and statistical analyses, I have developed an approach to investigate stable and plastic patterns of gene expression in the hippocampus. Using my data in combination with publicly available data, I identified genes that reproducibly distinguish subfields of the hippocampus. I identified genes that respond similarly across different types of manipulation and those that respond only to specific treatments. This study is a significant contribution to our understanding of how experimental techniques may or may not mask our ability to identify biologically meaningful signatures of gene expression. Given how much I integrated public data into my analysis pipeline, I hope that future researchers can benefit from incorporating my now public data into their own work for continued use and discovery.
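The distinction between genes that respond across manipulations and genes that respond to only one treatment can be illustrated with simple set operations. The sketch below is only an illustration under stated assumptions: the function name `classify_responses`, the manipulation labels, and the gene identifiers are hypothetical placeholders, not the thesis pipeline or its data. It assumes each manipulation has already been reduced to a set of significantly changed genes.

```python
# Minimal sketch: split genes into "shared" and "manipulation-specific" groups.
# All names and gene IDs below are hypothetical placeholders, not thesis data.

def classify_responses(de_genes):
    """Return (shared, specific) given a dict mapping each manipulation
    to the set of genes that changed significantly under it.

    shared   -- genes that changed under every manipulation
    specific -- dict mapping each manipulation to genes that changed
                under that manipulation and no other
    """
    manipulations = list(de_genes)
    # Genes present in every manipulation's list
    shared = set.intersection(*(set(de_genes[m]) for m in manipulations))
    specific = {}
    for m in manipulations:
        # Union of genes seen under any other manipulation
        others = set().union(
            *(set(de_genes[o]) for o in manipulations if o != m)
        )
        specific[m] = set(de_genes[m]) - others
    return shared, specific


# Toy input for illustration only
example = {
    "manipulation_A": {"gene1", "gene2", "gene3"},
    "manipulation_B": {"gene1", "gene3", "gene4"},
    "manipulation_C": {"gene1", "gene5"},
}

shared, specific = classify_responses(example)
print("shared:", sorted(shared))
for name, genes in specific.items():
    print(name, "only:", sorted(genes))
```

In this toy input, `gene3` changes under two of the three manipulations, so it falls into neither group. A real analysis would need to decide how to treat such partially shared genes.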
What I learned
One idea that emerged from synthesizing the information of my thesis as a whole, rather than each experiment individually, is that stability is more common than plasticity in gene expression in the brain. In other words, I expected to find thousands of genes showing significantly different expression with every manipulation, but what I found was that the effect of manipulation was much smaller than the gene expression differences that give rise to neuron-specific and region-specific patterns of expression in the brain (Chapters 1, 2, and 3). I think reading the literature biased me to always expect large, significant, meaningful patterns of expression that were evidence of transcriptomic plasticity underlying behavioral plasticity. However, when I look at what explains most of the variation in expression, it is mostly brain structure and cellular makeup that drive differences, not acute responses to experimental manipulations. A time series analysis could be the next step to examine how dynamic the transcriptome is at various milestones of the learning and memory formation process. Alternatively, one could block transcription at different times during learning and memory formation to better understand how transcription influences learning or how laboratory manipulations interact to alter gene expression.
Implications for broader scientific disciplines
The research questions, approaches, and conclusions described in this thesis are useful, at least in part, to scientists and educators of all disciplines. My thesis research questions have their roots in behavioral neuroscience and integrative biology. I drew upon literature describing the neural circuitry of the hippocampus and its contribution to spatial learning and memory discrimination. I complemented this logic with that of the brain and behavior schools of thought in integrative biology, where one describes the wealth of behavioral differences between two populations and tries to identify the underlying genomic contribution to variation in behavior. I hope to see more cross-talk and collaboration between scientists who narrow their focus to the deliberate manipulation of discrete molecules and those who study the random forces of evolutionary biology when trying to understand brain function.
Vision for reproducible research
In the future, I hope to see more large-scale databases of publicly available gene expression data that facilitate meta-analysis of new and old data to test for reproducibility. Even though I conducted three distinct experiments, I found it very useful to repeatedly analyze the data from all three thesis chapters to look for anomalies or to control for variables that were unexpectedly not controlled for by the experimental design. For the first time, I was also able to analyze my data together with publicly available data to confirm general patterns of transcription in the hippocampus. This meta-analysis (presented in Chapter 3) was relatively simple once the data were formatted such that similar cell or tissue types could be compared against explanatory variables of interest. I would like to see more meta-analyses like this in the future, where scientists analyze primary and secondary data in the same paper. Journals like eLife have pioneered a platform where the reproduction of secondary data can be submitted as a stand-alone article, and badge systems are being developed to demonstrate when new research provides evidence that a previous study has been reproduced or replicated. With good data management practices, I could look for patterns that were consistent and robust over time and compare them to those that were more variable. As I learned how to code with increasing elegance and skill, I continued to revisit old data with newly emerging technology and with ever-evolving color palettes. In a supplementary analysis, I conducted a meta-analysis of behavior, looking for robust patterns of memory across experiments, years, researchers, and laboratories (Figs. A5, A6). This type of analysis gave me further insight into the biological meaning of the data and sparked new conversations with colleagues.
Part II: Professional insight
The impact of community-developed products and practices
When I started my thesis research, I did not realize the extent to which the availability of open-source software tools and hardware for bioinformatics would impact the reproducibility of my workflow. Even though my computational workflows are publicly available, and they have been reused by undergraduates at my university, these workflows are designed to be run on a specific cluster at the Texas Advanced Computing Center that has been turbo-charged with in-house scripts and programs made available through the BioITeam in coordination with the Center for Computational Biology and Bioinformatics at UT Austin. Thus, the computationally expensive methods that I used are only useful to a specific community of academics. However, after I have wrangled and reduced the size of my next-generation sequencing data, I am empowered by the plethora of open-source software tools designed by public and volunteer organizations such as the NCBI, Ensembl, the R User Community, and Project Jupyter. This is a collaborative space where technologies grow and develop inside user communities, but they are used and contributed to by a global audience. Based on the foundation of knowledge, tools, and tutorials available on the internet, I was able to create a well-documented and reproducible workflow for gene expression analysis. The workflow is partially modularized so that users can mix and match programs to fit the needs of the experiment. I never knew how much I would enjoy the implementation of bioinformatics and data manipulation, and I am grateful to all the scientists who are committed to sharing their research outputs to support the scientific endeavors of their colleagues.
The impact of research-driven education
Both my laboratory and computational work were embedded in non-traditional classrooms. This means that I have six years of experience observing how experiments are carried out in research laboratories and teaching laboratories. All of my wet-laboratory experiments were conducted during June and July 2015 and July 2016 in conjunction with the Neural Systems & Behavior course. For eight weeks each summer, I attended lectures in the morning and conducted research in a laboratory from noon to 10 pm. For two of those weeks, I also co-taught and mentored students in laboratory and bioinformatic techniques in integrative molecular neuroscience. Given that I have used my data both to answer research questions and as a tool in the classroom, I believe that the resources and approach described in my thesis will be of relevance to other researchers and educators working with RNA sequencing data in particular or biological data in general. Additionally, I have seen first-hand what strategies work for teaching students the process of molecular biology rather than merely showing them how to follow recipes. I have also observed first-hand the success of short-term, intense research education experiences. The courses offered by Cold Spring Harbor and the Marine Biological Laboratory are crucial for helping students gain technical skills, knowledge, and a network of like-minded researchers [128]. Of course, semester-long university courses are an essential part of education, but I believe that graduate students should have all the financial support available to gain training during the summer and holiday breaks in the academic year.
Importantly, the bioinformatic approaches I adopted also stemmed from being embedded in the non-traditional teaching communities known as Software Carpentry and Data Carpentry. Since 2015, I have undergone a dramatic transformation by learning how to use and teach multiple programming languages and best practices for data analysis in the modern biology era. As a student in their workshops, I learned how bioinformatic programs and workflows were built and how to use them to conduct my research. Once I was certified as an instructor, I passed on this knowledge to classrooms of motivated learners whose research is now enhanced because of the experience [129]. As an instructor-trainer, I learned how to build classrooms where diverse students can learn to the best of their capabilities [130–132]. Taking all of this together, I have developed a practice of learning through teaching and teaching while learning that helps me adapt what I know to current tasks. This has been especially true for my teaching appointment in biostatistics this final semester of my graduate education. I am using teaching tools from my instructor training kit to create a positive learning environment (code of conduct, minute cards for immediate feedback, live coding, and real-world examples) and my practical experience using R to troubleshoot errors and explain concepts. That is what I give to my classroom, but I also gain a lot. I have learned more about hypothesis testing and statistical tests than I thought was possible, and I have been able to implement these approaches in my final workflow, just in time for the culmination of my thesis. My advice to all graduate students is to teach or assist with a statistics course at some point during their training to acquire these valuable skills.
As behavioral neuroscientists in search of molecular signatures of memory, it is important to remember that “A mouse cannot tell you if it remembers what it had for breakfast.” I now know how important it is to remember that not all behavioral displays can be easily interpreted, and not all memories manifest themselves in behavior. In practice, this means that multiple measures of activity must be examined in combination to fully characterize an animal's behavior and the neural processes that govern the expression of that behavior. It is also important that, as teachers and mentors, we do not lose sight of the importance of learning and reinforcement in our students' development into the scientists and educators of the future. Just as consistently reinforced training had a stronger effect on gene expression in the hippocampus than conflict training did, consistent reinforcement leads to better learning in the classroom or educational research laboratory.
In conclusion, I hope that the knowledge gained from my thesis research will be used to improve interdisciplinary graduate training programs and to increase the prevalence of reproducible research and open science practices in neuroscience, integrative biology, and beyond.