I was at the AITHYRA Symposium “AI for Life Science” in Vienna and enjoyed it a lot! I met many cool people and found out about the different labs and projects people are working on. I put some quick thoughts on X right away; here’s a slightly longer post about it.
On data
- The biggest takeaway for me was the importance of data collection and datasets in general. There were several talks specifically about it, and lots of other talks mentioned a good dataset as their biggest success factor. Every time you work on something novel, try to create a dataset out of your findings (and upload it to HuggingFace for better discoverability). You never know when such datasets will become useful: AlphaFold was only successful because of the existing PDB.
- You can collect a dataset with a specific use in mind. For example, one talk described how people hypothesized that fungi evolve lots of different chemicals to fight toxins, so scientists went to obscure places to collect fungi data and used it as a starting point for new drug development. Another example was going close to a volcano to find enzymes that can do very specific things, then using directed evolution to make more useful enzymes.
- Another addition to my Project Ideas: parse the latest papers, extract dataset links, and use AI agents to upload them to HF.
- Also, some “solved” and boring problems are not actually solved; they have only been solved on toy datasets. Creating new, better out-of-distribution datasets can create new opportunities!
- Similarly, not enough people are thinking about how to encode all these datasets to feed them into a computer. Nowadays people just play with different tokenization schemes, but there’s more to it!
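The "extract dataset links" step of the project idea above could start as something like this. It is only a minimal sketch: the host list and the function name are my own assumptions for illustration, and it only handles plain text that has already been extracted from a paper. The uploading part is left out.

```python
import re

# Hypothetical allow-list of hosts that commonly serve datasets; adjust as needed.
DATASET_HOSTS = (
    "huggingface.co/datasets",
    "zenodo.org",
    "figshare.com",
    "datadryad.org",
)

# Match http(s) URLs up to whitespace or a typical closing delimiter.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_dataset_links(text: str) -> list[str]:
    """Return unique URLs in `text` that point at a known dataset host,
    in order of first appearance."""
    seen = set()
    links = []
    for match in URL_PATTERN.finditer(text):
        # Drop sentence punctuation that got glued to the end of the URL.
        url = match.group(0).rstrip(".,;:")
        if any(host in url for host in DATASET_HOSTS) and url not in seen:
            seen.add(url)
            links.append(url)
    return links


if __name__ == "__main__":
    sample = (
        "Data is available at https://zenodo.org/record/123. "
        "Code lives at https://github.com/a/b, with a mirror "
        "(https://huggingface.co/datasets/foo/bar)."
    )
    for link in extract_dataset_links(sample):
        print(link)
```

A plain allow-list is crude (it misses datasets hosted on lab websites, for example), but it is easy to inspect, and an AI agent could take over from there for the ambiguous cases.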
On models in general
- Coming back to evolution: it can explain a lot about the way biology and proteins work, and we should use this to improve our models. I think this is what the Evoformer in the AlphaFold model does, and it’s good to keep it in mind while working on life science research.
- Also, AlphaFold 3 works great and other models still struggle to improve on it, but its inference is slow. Once that is improved, we may see an advancement in test-time compute similar to the one we see in LLMs.
- Some chemical algorithms were designed in the 80s for the compute power of that time but are still used today. We should be able to dissect old methods and see how we can reinvent them; maybe we can even automate it with AI (plugging my Paper To Project here).
- There’s a way to improve models by adding additional constraints from physics, like x-ray info and [cryo-EM](https://www.owlposting.com/p/a-primer-on-ml-in-cryo-electron-microscopy).
On processes
- Lots of people complained about processes! Some didn’t like grants (good thing there are new ways of funding like FROs), others didn’t like politics, and others just wanted to do science and not bother with all the organizational work like setting up randomized controlled trials. People try to improve this by outsourcing and by collaborating with LLMs, but it’s either too expensive or not reliable enough. AI doesn’t help much here because people have to teach it the same things every time, and it’s just faster to do it yourself.
- Some scientists are not using Research Assistant AI Tools at all! Some haven’t even heard about them; others tried them, but they were not good enough. These tools are usually missing some simple but important functionality, like restricting semantic paper search to specific journals.
- Another addition to my Project Ideas: AI lab notebook. With automatic templates tailored to you and your hypothesis, automatic data collection, connection to your tools etc. You just write about your days, ideas and results, and it creates your story for you and writes your papers. AI lab notebook: don’t just publish papers, tell your journey xD
- Also about tools: there are too many of them in biology with similar functionality. Some try to fix this with ToolUniverse, but it’s hard. Is there the same problem with MCP servers?
- Also, lots of robots and lab equipment sit there underutilized. Can that be fixed? Why is there not more specialization, where you just outsource your experiments to a robotic factory instead?
- And lastly, labs and companies are not that important, people are the most important part of a career. It’s best to find people striving for the same results and working hard, join them and have a great work environment. Success is being excited to go to work and being excited to come home.