Problems in optimisation with applications to biology?

I will soon have around 8 weeks to do a research project of my choosing, as I am finishing my undergraduate degree in computer science. My experience is mainly in optimisation algorithms, particularly combinatorial/discrete optimisation.

I am interested in applying computer science to biology (mostly human biology). Does anyone know of optimisation problems with applications to biology where progress would benefit current research?

Although I have read a lot about biology in my free time, I don't know the current state of the research well, so it is hard for me to find a good topic.

Some examples to spark the discussion are string problems in genetics (alignment problems), and community detection in graphs (to find proteins in the same functional group).

Thank you!

Maximum likelihood phylogenetic trees. Any single tree is easy to score, but the space of possible trees is so large that finding the optimal one is extremely difficult. There is some research on this, but there haven't been many real breakthroughs, so it's probably a pretty rich field.
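To give a feel for how large the tree space is, here is a minimal Python sketch that counts unrooted binary tree topologies on n labelled leaves using the standard (2n-5)!! formula. The function name is just illustrative, and this only counts topologies; it says nothing about branch lengths or the likelihood scoring itself.

```python
def num_unrooted_binary_trees(n_leaves: int) -> int:
    """Count unrooted binary tree topologies on n labelled leaves: (2n-5)!!."""
    if n_leaves < 3:
        raise ValueError("need at least 3 leaves")
    count = 1
    # Multiply the odd factors 3, 5, ..., 2n-5 (empty product for n = 3).
    for k in range(3, 2 * n_leaves - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 5, 10, 20, 50):
        print(n, num_unrooted_binary_trees(n))
```

Already at 10 leaves there are about two million topologies, which is why exhaustive search is out of the question and heuristics dominate in practice.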

Here are some examples:

  • Sequence (DNA, protein) alignment. With the development of next-generation sequencing methods, this is an important and active field.
  • Genome distance calculation by genome rearrangements
  • The phasing problem for single nucleotide polymorphism data (determining haplotypes)
  • Factor blocking problems in statistical experiment design
  • Network (graph) analysis. Many biological data sets, for example pairwise protein interactions, can be expressed as networks. These give rise to a variety of discrete problems, such as finding maximum connected subgraphs, graph flow problems, etc.
  • Biclustering of data matrices, notably gene expression data
  • Inference on phylogenetic trees was already mentioned.
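As a concrete taste of the first item, here is a minimal sketch of global pairwise alignment scoring by dynamic programming (the Needleman-Wunsch recurrence). The scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name are assumptions chosen for illustration; real tools use substitution matrices and affine gap penalties.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Optimal global alignment score of a and b with a linear gap penalty."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

This runs in O(len(a) * len(b)) time, which is fine for two short sequences. The interesting optimisation questions start when you have many sequences or very long ones.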

I'm sure there are many more; this is just off the top of my head. Happy googling! :)

