Foldit Wiki

PDB entry 3AXM seen in Jmol with a cartoon view.

This article provides a very brief introduction to getting started, hardware requirements, finding and downloading free protein models, and basic navigation for various viewers.

Examining how real proteins are modeled can help novices to intuitively learn how the various 3D protein structures are naturally positioned and arranged in the real world. Many scientists offer their research free of charge on the Internet for anyone to download and study, and the tools to view and examine proteins are also often free.

Some of the example structures are enormous, running to tens of thousands of atoms or more, so they are not offered as a complete structural download: modeling them in full would take a supercomputer, which not everyone can afford.

Instead, the huge objects are usually constructed of smaller protein blocks that repeat over and over, and it is these sections that are offered for download.

Viewing a PDB[]

PDB files are nothing more than simple text files that describe every atom in a structure and list the connections between atoms. The three-letter .pdb extension generally identifies the file as a protein structure to be opened with a special viewer. If a viewer such as Avogadro or Chimera is installed, PDB files are usually associated with that viewer automatically and open directly as a 3D view.
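Because a PDB is plain text, its atom list can be read with a few lines of code. The following is a minimal sketch, not a full parser: it assumes the standard fixed-column layout of ATOM, HETATM, and CONECT records, and the function name `parse_pdb` and the sample coordinates are invented for illustration.

```python
# Made-up sample in PDB fixed-column layout (not taken from a real entry).
SAMPLE = """\
HEADER    EXAMPLE
ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N
ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C
CONECT    1    2
END
"""


def parse_pdb(text):
    """Return (atoms, bonds) read from PDB-format text.

    atoms is a list of dicts; bonds is a list of (serial, serial) pairs
    taken from CONECT records.
    """
    atoms, bonds = [], []
    for line in text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            atoms.append({
                "serial": int(line[6:11]),        # columns 7-11
                "name": line[12:16].strip(),      # columns 13-16
                "residue": line[17:20].strip(),   # columns 18-20
                "chain": line[21],                # column 22
                "xyz": (float(line[30:38]),       # columns 31-38
                        float(line[38:46]),       # columns 39-46
                        float(line[46:54])),      # columns 47-54
            })
        elif record == "CONECT":
            source = int(line[6:11])
            # Up to four bonded atom serials follow, five columns each.
            for start in range(11, 31, 5):
                field = line[start:start + 5].strip()
                if field:
                    bonds.append((source, int(field)))
    return atoms, bonds


atoms, bonds = parse_pdb(SAMPLE)
print(len(atoms), "atoms,", len(bonds), "explicit bonds")
```

Slicing by column rather than splitting on spaces is deliberate, since neighboring fields in a PDB line may run together without a space between them.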

The PDB may also contain heading information that is plain text and is directly readable from any simple text editor such as Windows Notepad. However, a PDB over 500 kilobytes can contain thousands of lines of data, and Windows Notepad can seem to hang (Not responding) for several seconds to several minutes while an extremely large text file is opened.
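For a very large file, the heading information can be read without loading everything into an editor. This sketch stops at the first coordinate record; it assumes the heading lines use record names such as HEADER, TITLE, COMPND, SOURCE, and REMARK, and the function name `read_header` and the sample text are invented for illustration.

```python
import io

HEADER_RECORDS = ("HEADER", "TITLE", "COMPND", "SOURCE", "REMARK")
COORDINATE_RECORDS = ("ATOM", "HETATM", "MODEL")


def read_header(stream, limit=20):
    """Collect up to `limit` heading lines, stopping at the first coordinate record."""
    lines = []
    for line in stream:
        record = line[:6].strip()
        if record in COORDINATE_RECORDS:
            break  # the atom list has started; nothing more to read
        if record in HEADER_RECORDS:
            lines.append(line.rstrip("\n"))
            if len(lines) >= limit:
                break
    return lines


# Made-up sample; a real file would be opened with open("some_file.pdb").
sample = io.StringIO(
    "HEADER    EXAMPLE PROTEIN\n"
    "TITLE     MADE-UP ENTRY FOR ILLUSTRATION\n"
    "ATOM      1  N   MET A   1      27.340  24.430   2.614\n"
)
for heading in read_header(sample):
    print(heading)
```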

Small proteins can load quickly into the viewer, but the load time increases dramatically as the number of atoms increases. It is normal to try to open a protein in a viewer and for the viewer to seem to hang with (Not responding) in the Windows title bar. It is working hard, so please wait patiently.

Finding proteins to view[]

There are a number of free resources available, and some are better suited for general access by students and the public than others. The naming of proteins can be arcane and difficult for a newcomer to understand, but there are websites that have made an attempt to highlight some of the more interesting proteins and structures that have been investigated.

RCSB-PDB Molecule of the Month[]

The Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) provides a wealth of easy to access molecular models. The PDB site is updated each week on or about Wednesday 00:00 UTC (Coordinated Universal Time) with new entries, modified entries, and updated status information.

RCSB Main webpage:

RCSB Molecule of the Month:

Viewing RCSB proteins[]

For a typical example, such as the Tobacco Mosaic Virus, click on the image or article name to bring up the detailed description of the virus. From here, the text usually refers to what seem to be randomly named strings of numbers and letters, which are how proteins are named. Click on the protein name to bring up a page devoted to describing it.

Protein 2tmv in the Tobacco Mosaic Virus:

On the protein page at the top is the protein name, and a row of icons, one with a downward pointing arrow. Click on this icon to download the PDB file.

Download PDB file for protein: 2tmv http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=2TMV

Next to it is an icon that looks like a page of text. This allows you to view the text of the PDB file directly in your web browser.

View text of PDB for protein: 2tmv
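The download link above follows a simple pattern, so it can be built for any protein name. This is a sketch only: it copies the URL form shown in this article, which the RCSB site may since have changed, and it assumes the classic four-character protein names. The function name `download_url` is invented for illustration.

```python
from urllib.parse import urlencode

# URL form copied from the 2tmv link in this article; it may no longer be served.
BASE = "http://www.rcsb.org/pdb/download/downloadFile.do"


def download_url(pdb_id):
    """Build the download link for a four-character PDB name such as '2tmv'."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError("expected a four-character PDB name, got %r" % pdb_id)
    query = urlencode({
        "fileFormat": "pdb",
        "compression": "NO",
        "structureId": pdb_id,
    })
    return BASE + "?" + query


print(download_url("2tmv"))
```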

Finding the native of an old puzzle, or checking whether an existing protein matches your design[]

Run this recipe to get the protein's one-letter amino acid sequence, then search the PDB with that sequence using the advanced search's sequence option.
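If the sequence is only available as three-letter residue names, it can be converted to the one-letter form that the sequence search expects. This sketch covers the 20 standard amino acids and marks anything else as X; the function name `to_one_letter` is invented for illustration and is not part of the recipe.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def to_one_letter(residues):
    """Convert a list of three-letter residue names to a one-letter sequence."""
    return "".join(THREE_TO_ONE.get(name.strip().upper(), "X") for name in residues)


print(to_one_letter(["MET", "GLY", "TRP", "LYS"]))
```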

Protein Classification[]

Browse proteins by structure

SCOP: http://scop.berkeley.edu/sunid=0

CATH: http://www.cathdb.info/browse/browse_hierarchy_sunburst

Atlas of Macromolecules[]

The University of Massachusetts Amherst microbiology department has created an Atlas of Macromolecules to document some of the more interesting molecular structures.

Main webpage:

Molecular visualization resources by Eric Martz:

Basic Viewer navigation[]

This is a short introduction to each viewer, to permit a novice to jump in feet first and quickly learn to navigate the structure and to view it in different ways.

Avogadro[]

Avogadro does not appear to offer as many abstract viewing options for proteins as some other viewers. However, it is very easy to learn and may be useful as a stepping stone to more complex viewers.

Software needed[]

Currently, Foldit does not allow the importing and viewing of real protein models. Instead, an external editor/viewer is needed. There are a number of molecular modeling tools available for various hardware platforms. Some are free, and some are commercial.

Program/Project | Website | Windows | Linux | Mac OSX | Cost | Notes
Avogadro | http://avogadro.openmolecules.net/ | Yes | Yes | Yes | Free |
DeepView / Swiss-PdbViewer | http://spdbv.vital-it.ch/ | Yes | No* | Yes | Free | *Works in Linux via Wine
Jmol | http://molvis.sdsc.edu/fgij/index.htm | Yes | Yes | Yes | Free | Works instantly in any browser with Java installed
UCSF Chimera | http://www.cgl.ucsf.edu/chimera/ | Yes | Yes | Yes | Free |
Protein Explorer | http://www.umass.edu/microbio/chime/pe_beta/pe/protexpl/frntdoor.htm | Yes | No | No | Free | No longer supported as of Oct 2008

Other modeling software[]

A huge list of many more modelers is available, though some of these are not 3D viewers and not easy to quickly pick up and use:

World Index of BioMolecular Visualization Resources / Free Molecular Visualization and Modeling Software

Hardware needed[]

The computing power needed for molecular modeling tools can be deceptive. For simple molecular structures of 25 atoms or fewer, the modeling tools often run easily on just about any computer, including older systems with minimal memory, a slow processor, and only a generic non-3D graphics system such as Intel GMA.

However, protein models can run to thousands of atoms, with multiple proteins that interlock to form a much larger structure. This can bring a slow PC to its knees, because holding the complex structures requires a lot of memory, and drawing thousands of atoms onscreen at once calls for a high-priced OpenGL-capable 3D card. Simply rotating the model can be torturously slow, even before attempting to modify the structure.

To simplify matters, the hardware is broken down into four classes of systems, to give a general idea of what is needed for various levels of molecular modeling.

Generic desktop PC[]

This is a typical generic office PC from about 2006. It is good enough for modeling up to about 30 atoms before it starts to get sluggish.

Processor Intel Celeron 2 GHz, Intel Pentium 4 1.5 GHz
Memory 512 megabytes PC133 or DDR-266
Video No 3D capability, Intel GMA 850

Minimalist system[]

This is a typical gaming PC from about 2007. It is good enough for modeling up to about 250 atoms before it starts to get sluggish. Gaming PCs include 3D cards which can do OpenGL but are not specifically optimized for the needs of scientific/engineering OpenGL modeling.

Processor Intel Pentium 4 3.0 GHz
Memory 1 gigabyte DDR-400
Video nVidia 256 megabyte 7800GT

Good performance system[]

This is a recent-build desktop PC gaming platform. It is good enough for modeling up to about 1000 atoms before it starts to get sluggish. Newer gaming 3D cards have better OpenGL support but still are not specifically optimized for the needs of scientific/engineering OpenGL modeling.

Processor Intel Core 2 Duo 2.5 GHz
Memory 4 gigabytes DDR2-800
Video Dual nVidia 512 megabyte 8800GT in SLI

High performance system[]

This is a midrange workstation. It is good enough for modeling up to about 3000 atoms before it starts to get sluggish. This uses an actual OpenGL-optimized 3D card specifically built for the needs of scientific/engineering OpenGL modeling, and the card may cost more than the rest of the computer.

Processor Intel Core 2 Quad 3.0 GHz
Memory 8 gigabytes DDR2-800
Video ATI FireGL 512 megabyte V7600
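
The four classes above can be summarized as a simple lookup from atom count to hardware class. This is only a sketch: the atom limits are the rough "before it starts to get sluggish" figures given in this article, not measured values, and the function name `suggest_class` is invented for illustration.

```python
# (approximate atom limit, class name) pairs, taken from the sections above.
HARDWARE_CLASSES = [
    (30, "Generic desktop PC"),
    (250, "Minimalist system"),
    (1000, "Good performance system"),
    (3000, "High performance system"),
]


def suggest_class(atom_count):
    """Return the lowest hardware class expected to cope, or None if none listed does."""
    for limit, name in HARDWARE_CLASSES:
        if atom_count <= limit:
            return name
    return None  # larger than anything the four classes are described as handling


for count in (25, 500, 50000):
    print(count, "atoms:", suggest_class(count))
```

A result of None corresponds to the very large structures mentioned earlier, which are usually offered only as smaller repeating sections.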