Contents:
The morloc alternative
In the morloc library, all file support will be removed. The idea is that the
tRNA prediction library should only predict tRNA. It should not have to support
FASTA or GenBank parsing. Functions in general-purpose libraries can read from
and write to storage formats, if necessary.
What do we want to return?
predictTrna :: Str -> ???
record TRNA = TRNA
{ anticodon :: Int -- the offset of the start of the anticodon
, energy :: Real -- the thermodynamic energy of the tRNA structure as
-- calculated by ARAGORN
-----
, astem1 :: Int
, spacer1 :: Int
, dstem :: Int
, dloop :: Int
, spacer2 :: Int
, cstem :: Int
, cloop :: Int
, intron :: Int -- position of the intron start
, nintron :: Int -- length of the intron
, var :: Int
, tstem :: Int
, tloop :: Int
, astem2 :: Int
}
The TRNA type is “affine”. That is, it describes the pattern of a tRNA independently of its origin. It contains no information about the species of origin and thus stores only the location of the anticodon and not the amino acid it codes for, since this may vary depending on the genetic code. It contains no genomic start coordinate and no strand information.
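To illustrate why the amino acid is left out of the record, the sketch below decodes one anticodon under two genetic codes. It is a minimal Python illustration and not part of the morloc library: the function names, the 0-based offset convention, the toy sequence, and the single-entry codon tables are all assumptions made for the example. Only the two codon assignments (TGA as stop in the standard code, TGA as tryptophan in the vertebrate mitochondrial code) are established facts.

```python
# Hypothetical helper names; none of these come from ARAGORN or morloc.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    complement = str.maketrans("acgt", "tgca")
    return seq.translate(complement)[::-1]


def anticodon_at(seq: str, offset: int) -> str:
    """Extract the 3-base anticodon, assuming a 0-based offset into seq."""
    return seq[offset:offset + 3]


# Each table lists only the codon needed for this example.
STANDARD_CODE = {"tga": "*"}          # TGA is a stop codon
VERTEBRATE_MITO_CODE = {"tga": "W"}   # TGA encodes tryptophan


def decode(anticodon: str, code: dict) -> str:
    """Look up the amino acid for the codon that pairs with the anticodon."""
    return code[reverse_complement(anticodon)]


seq = "ggttcaaa"                      # toy sequence with "tca" at offset 3
anticodon = anticodon_at(seq, 3)
print(anticodon,
      decode(anticodon, STANDARD_CODE),
      decode(anticodon, VERTEBRATE_MITO_CODE))
```

The same offset yields a different translation depending on which table the caller supplies, so the choice of genetic code can stay outside the predictor.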
The TRNA type is also minimal. It contains no information that can be derived without rerunning the ARAGORN algorithm. Conspicuously missing is the actual tRNA sequence. This could be included without loss of the affine property. However, doing so would require extra work, at a computational and memory expense, that the user may not want and can easily do later.
The TRNA type is also fixed in size. Every field in the structure has a fixed-width type (ints and doubles, in C). Thus the record has a fixed size and can be stored very efficiently without having to look up variable-length values like strings.
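The fixed-size property can be sketched with Python's standard `struct` module. This is an illustration under stated assumptions, not the layout ARAGORN or morloc actually uses: it assumes 32-bit ints, a 64-bit double, little-endian byte order, and no padding, and the field values in the usage line are made up. A real C struct with the same fields may be larger because of alignment padding.

```python
import struct
from typing import NamedTuple


class TRNA(NamedTuple):
    """Python mirror of the TRNA record; field order follows the record."""
    anticodon: int
    energy: float
    astem1: int
    spacer1: int
    dstem: int
    dloop: int
    spacer2: int
    cstem: int
    cloop: int
    intron: int
    nintron: int
    var: int
    tstem: int
    tloop: int
    astem2: int


# "<" = little-endian, no padding; "i" = 32-bit int; "d" = 64-bit double.
# One int, one double, then the remaining 13 ints (assumed layout).
TRNA_LAYOUT = struct.Struct("<id13i")


def pack(record: TRNA) -> bytes:
    """Serialize a record into a fixed-length byte string."""
    return TRNA_LAYOUT.pack(*record)


def unpack(buffer: bytes) -> TRNA:
    """Rebuild a record from bytes produced by pack()."""
    return TRNA(*TRNA_LAYOUT.unpack(buffer))


# Made-up values, for illustration only.
example = TRNA(33, -101.5, 7, 2, 4, 8, 1, 5, 7, 0, 0, 5, 5, 7, 7)
print(TRNA_LAYOUT.size, unpack(pack(example)) == example)
```

Because every record occupies the same number of bytes, record `k` in a packed array starts at byte `k * TRNA_LAYOUT.size`, so no index of variable-length fields is needed.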
The goal is to develop a minimal type that ARAGORN can emit that is independent
of anything else in the wider ecosystem. I do not, for example, want to import
some bio object-oriented library and make TRNA a subclass of some generic
Feature object. ARAGORN needs to be timeless and independent.
                       cca     3' amino-acyl acceptor
                   5' a        NCCA is the canonical motif
                    g-c
                    c-g
                    g-c        A-stem
                    g-c
                    a-t
                    g-c
          space1    t-a      ta
                   t    cgccc  a
           tga    a     !!!!!   a    T-loop
          c   tttg      gcggg  c
D-loop   t    :+!!     c     tt
          g   tgac      c
           gta     g     g
                    t-aag      var
          spacer2   c-g
                    a-t
                    g-c
                    c-g
                   t   a
                   t   a       C-loop (anticodon stem-loop)
                    ccc
                    \ /
                 anticodon
morloc implementation
However, it segfaults when the flanks around the tRNA are too short.
morloc ecosystem
Making a primordial rna package.
- searching both strands
- circular search
- codon translation
- mapping
- filtering
- parallelism
- visualization
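As one example of how such features could live outside the predictor, the sketch below covers the first and fourth items (searching both strands, mapping). It is a hedged illustration in Python: `search_both_strands` and `toy_predict` are hypothetical names, the predictor is passed in as a parameter, and `toy_predict` is only a stand-in that reports offsets of the literal string "cat" rather than running ARAGORN. The coordinate rule used is that a feature of length k found at offset i on the reverse complement of a length-n sequence starts at n - i - k on the forward strand.

```python
from typing import Callable, List, Tuple

ANTICODON_LENGTH = 3


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    complement = str.maketrans("acgtACGT", "tgcaTGCA")
    return seq.translate(complement)[::-1]


def search_both_strands(
    seq: str, predict: Callable[[str], List[int]]
) -> List[Tuple[str, int]]:
    """Run a strand-agnostic predictor on both strands.

    `predict` is assumed to return 0-based anticodon offsets relative to
    the string it is given. Minus-strand hits are mapped back to forward
    coordinates.
    """
    hits = [("+", offset) for offset in predict(seq)]
    n = len(seq)
    for offset in predict(reverse_complement(seq)):
        hits.append(("-", n - offset - ANTICODON_LENGTH))
    return hits


def toy_predict(seq: str) -> List[int]:
    """Stand-in predictor: offsets of every 'cat' (NOT a tRNA search)."""
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "cat"]


print(search_both_strands("ttcatgg", toy_predict))
```

The predictor itself never sees strand or genomic coordinates, which matches the affine design of the TRNA record; the wrapper adds that context afterwards.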