Homologous Invertebrate Genes Database

     

HOINVGEN release 01

Online since March 2005

Release information: Protein Nucleotide


Data maintenance

Ingo Paulsen
Heinrich-Heine-University Düsseldorf
Institute for Computer Sciences
Department of Bioinformatics

Query Hoinvgen

You can query Hoinvgen using WWW-Query, Quick Search or CrossTaxa.

Alternatively you can use the following links:
Search by keyword: protein sequences, nucleotide sequences
Search by sequence: protein sequences, nucleotide sequences


HOINVGEN is a database of homologous invertebrate genes, structured under the ACNUC sequence database management system. It allows one to select sets of homologous genes among invertebrate species, and to visualize multiple alignments and phylogenetic trees. HOINVGEN is therefore particularly useful for comparative sequence analysis, phylogeny and molecular evolution studies. More generally, HOINVGEN gives an overall view of what is known about a particular gene family.

The version of HOINVGEN

HOINVGEN is built in the same way as HOBACGEN (for more details, click here, or see the paper on HOBACGEN by Perrière et al. Genome Res 2000 10:379-85). In addition, we developed a new graphical interface named FamFetch. This interface is written in Java, should work on any computer (Mac, PC, UNIX, etc.), and does not require the whole database to be installed locally.

Content

The database itself contains all invertebrate protein sequences from UniProt (SWISS-PROT + TrEMBL), with some data corrected, clarified or completed (notably to address the problems of redundancy and orthology/paralogy) and with some annotation modifications. It also contains all the corresponding nucleotide sequences from EMBL. Homologous proteins are classified into families, and multiple alignments and phylogenetic trees are computed for each family. Sequences and related information are structured in an ACNUC database. A description of how the database is built is available here.

The present version of HOINVGEN is release 01 (March 2005). It was built using sequences from UniProt 2.5 (SWISS-PROT 44.5 and TrEMBL 27.5, September 2004). It contains a total of 132,023 protein sequences (and 116,498 nucleotide sequences) classified in 10,073 families.

Among all the proteins included in this release, 96,352 (73%) are classified into 10,073 families containing at least two sequences, and 35,672 (27%) partial proteins are not attached to a family.

Graphical User Interface

The HOINVGEN interface is based on a client/server architecture. To access the database, you only need to install the FamFetch application on your computer. This program, written in Java, integrates a GUI that gives easy access to the gene families and displays their multiple alignments and phylogenetic trees.

In FamFetch phylogenetic trees, genes are colored according to the species they come from. Users can modify the color table according to the taxa (at any taxonomic level) they are interested in. This color table is saved in a preference file, named .famfetch on UNIX-like systems (UNIX, Linux, Mac OS X), FamFetch.Prefs on Mac OS 9 or earlier, and FamFetch.ini on Windows. The default color table installed with FamFetch is designed for prokaryotes (for the HOBACGEN database). You can replace this preference file with the one we have prepared for invertebrates, which is available here.
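As an illustration of the preference file naming described above, here is a minimal Python sketch that maps a platform string to the corresponding file name. The function name is hypothetical and not part of FamFetch; the platform labels follow Python's sys.platform convention, and treating "mac" as the label for Mac OS 9 or earlier is an assumption. The directory in which FamFetch stores the file is not stated on this page, so only the name is returned.

```python
import sys


def famfetch_prefs_filename(platform: str = sys.platform) -> str:
    """Return the FamFetch preference file name for a platform string.

    Hypothetical helper for illustration; it only encodes the three
    file names listed on this page.
    """
    if platform.startswith("win"):
        # Windows systems
        return "FamFetch.ini"
    if platform == "mac":
        # Assumed label for Mac OS 9 or earlier
        return "FamFetch.Prefs"
    # UNIX, Linux and Mac OS X
    return ".famfetch"


if __name__ == "__main__":
    print(famfetch_prefs_filename())
```

Replacing the default prokaryote color table would then amount to copying the invertebrate preference file over the file with this name.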

WWW access

It is also possible to query the database on this server through the WWW-Query system. Note that HOINVGEN is split into two databases on this server: HOINVPROT contains the protein sequences from SWISS-PROT + TrEMBL, while HOINVNUCL contains the nucleotide sequences from EMBL.

Server mirroring

You don't need to install the server itself to have HOINVGEN running on your computer, as the client is enough for that purpose. However, you may want to set up your own server in order to speed up your database access and to offer that service to potential users in your geographic area. To install a HOINVGEN server, you first need to register. The page returned after registration gives access to the server installation procedure.

The whole database is available from our FTP server at URL:

ftp://pbil.univ-lyon1.fr/pub/hoinvgen_new/
Note that downloading the database is much more efficient with a dedicated FTP client than with a web browser.
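For scripted downloads, a sketch along the following lines could be used with Python's standard ftplib module. It assumes that the server still accepts anonymous logins and that the directory contains plain files; the function names and the local directory name are hypothetical, and entries that cannot be retrieved as files (for example subdirectories) are simply skipped.

```python
from ftplib import FTP, error_perm
from pathlib import Path
from urllib.parse import urlparse

HOINVGEN_URL = "ftp://pbil.univ-lyon1.fr/pub/hoinvgen_new/"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, remote directory)."""
    parts = urlparse(url)
    return parts.hostname, parts.path


def download_all(url=HOINVGEN_URL, dest="hoinvgen"):
    """Download every plain file in the remote directory into dest."""
    host, remote_dir = split_ftp_url(url)
    out = Path(dest)
    out.mkdir(parents=True, exist_ok=True)
    with FTP(host) as ftp:
        ftp.login()  # anonymous login (assumption)
        ftp.cwd(remote_dir)
        for name in ftp.nlst():
            target = out / Path(name).name
            try:
                with open(target, "wb") as fh:
                    ftp.retrbinary(f"RETR {name}", fh.write)
            except error_perm:
                # Not retrievable as a file (e.g. a subdirectory): skip it
                target.unlink(missing_ok=True)


if __name__ == "__main__":
    download_all()
```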

Important note: the SWISS-PROT entries such as those found in HOINVGEN are copyrighted. They are produced through a collaboration between the Swiss Institute of Bioinformatics and the European Bioinformatics Institute. There are no restrictions on their use by non-profit institutions as long as their content is in no way modified. Usage by and for commercial entities requires a license agreement (See or send an email to license@isb-sib.ch).

Contact and reference

If you encounter any problems when installing or using HOINVGEN, please contact Ingo Paulsen or Laurent Duret. We also welcome any comments or suggestions on the database and/or its interface.

