DNA and group theory
I came up with an idea some time ago. Let me lay it out here for your pleasure. DNA is full of symmetries. What if we used group structure to probe it? Based on this intuition I decided to map each base in DNA to a specific group generator.
The choice of group took some time. I wanted something of dimension 3 or 4 to fit the four DNA bases naturally. I figured that if the underlying symmetry of the bases was mirrored in the relationships of the group generators to each other, this mapping should show it.
Finally I decided that I needed a continuous Lie group. I wanted to represent the DNA sequence as a continuous path on a manifold, where each base moves the path in a direction determined by its generator, rather than treating the sequence as discrete jumps between states. I settled on SU(2). Its generators are, up to a factor of i, the three Pauli matrices. I added the identity to get four matrices, one per base.
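To make the construction concrete, here is a minimal sketch of one way to build such a path. It is an illustration under assumptions, not necessarily the exact setup used for this post: the base-to-matrix assignment (A to the identity, C, G, T to the three Pauli directions), the step angle `theta`, and the helper names `step`, `path_product` and `trace` are all hypothetical choices. Each non-identity base contributes a small SU(2) step exp(i·theta·sigma) = cos(theta)·I + i·sin(theta)·sigma.

```python
import math

# 2x2 matrices as nested tuples, so the sketch needs only the standard library.
IDENTITY = ((1, 0), (0, 1))
SIGMA_X = ((0, 1), (1, 0))
SIGMA_Y = ((0, -1j), (1j, 0))
SIGMA_Z = ((1, 0), (0, -1))


def matmul(a, b):
    """Multiply two 2x2 matrices."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def step(sigma, theta):
    """SU(2) element exp(i*theta*sigma) = cos(theta)*I + i*sin(theta)*sigma."""
    c, s = math.cos(theta), math.sin(theta)
    return tuple(
        tuple(c * IDENTITY[r][k] + 1j * s * sigma[r][k] for k in range(2))
        for r in range(2)
    )


def path_product(seq, theta=0.1):
    """Ordered product of one SU(2) step per base.

    ASSUMPTION: the assignment A->I, C->sigma_x, G->sigma_y, T->sigma_z and the
    step angle theta are illustrative; the post does not specify them.
    """
    steps = {
        "A": IDENTITY,
        "C": step(SIGMA_X, theta),
        "G": step(SIGMA_Y, theta),
        "T": step(SIGMA_Z, theta),
    }
    m = IDENTITY
    for base in seq:
        m = matmul(m, steps[base])
    return m


def trace(m):
    """Trace of a 2x2 matrix; it is real for elements of SU(2)."""
    return (m[0][0] + m[1][1]).real
```

Because the steps do not commute, `path_product("CG")` and `path_product("GC")` are different matrices. The trace is blunter than the matrix itself, though: it is unchanged by cyclic rotation of the sequence, so it only starts to register order for rearrangements that are not rotations, for example `"CGT"` versus `"CTG"`.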
I was convinced I had something.
I didn’t. Let me list the ways I failed. I started with the trace of the accumulated matrix product as my statistic. Since SU(2) is non-abelian, I figured the product would be sensitive to the order of the bases, so the trace of the final matrix for a real DNA sequence would differ from that of a randomized one. Spoiler alert: it didn’t. I tried all kinds of sequences (bacterial, fungal, rRNA), and none of them gave me a path distinguishable from random. I also tried other statistics, with the same result. Whatever symmetry was present in these sequences, SU(2) wasn’t finding it. So I pivoted. Let’s do V4, the Klein four-group, I thought.
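For readers who want to try the same kind of comparison, the test against random described above can be sketched as a shuffle test. This is a hedged illustration, not the code behind the results in this post: the function name `shuffle_null`, the number of shuffles, the seed, and the two-sided empirical p-value are all assumptions. `statistic` is any function from a sequence string to a number, such as the trace of the SU(2) path product.

```python
import random


def shuffle_null(seq, statistic, n_shuffles=200, seed=0):
    """Compare statistic(seq) with its values on fully shuffled copies of seq.

    Full shuffling keeps base composition but destroys all ordering.
    Returns (observed, null_values, p_value), where p_value is a two-sided
    empirical p-value measured around the mean of the null values.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = statistic(seq)
    bases = list(seq)
    null_values = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        null_values.append(statistic("".join(bases)))
    mean = sum(null_values) / len(null_values)
    # Count shuffles at least as far from the null mean as the real sequence.
    extreme = sum(1 for v in null_values if abs(v - mean) >= abs(observed - mean))
    p_value = (1 + extreme) / (1 + n_shuffles)
    return observed, null_values, p_value
```

A quick sanity check is to pass a statistic that ignores order, such as `lambda s: s.count("A")`. Every shuffle then gives the same value as the real sequence, and the p-value comes out as 1.0, which is what "not distinct from random" looks like in this setup.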
It worked.
More on this in my next post. For the moment let me leave you with this thought: almost any transformation that preserves local structure but is tested against full randomization will show signal. What do you think?