|
"The RCSB Protein Data Bank (RCSB PDB, http://rcsb.org) is a global online resource that provides access to atomic-level information about proteins, nucleic acids, and complex macromolecular assemblies available in the PDB archive. It develops tools and resources for research and education in molecular biology, structural biology, computational biology, and beyond.
PDB data are crucial to users around the world; our website supports many millions of users each year. PDB data are also redistributed by ~500 external data resources and stored for reuse inside the firewalls of all major biopharmaceutical companies and many biotechnology companies. The enormous wealth of 3D structure data stored in the PDB has underpinned significant advances in our understanding of protein architecture, culminating in recent breakthroughs in protein structure prediction accelerated by artificial intelligence approaches, including machine learning and deep learning methods.
Undergraduate researchers would work on Python-based data science software development projects that may include implementing improved data compression algorithms; developing systems for manipulating large amounts of data; creating and expanding Python wrappers for RCSB.org APIs (a project described in a Journal of Molecular Biology paper, https://doi.org/10.1016/j.jmb.2025.168970); and other data-intensive software development work.
The RCSB PDB is headquartered within the Institute for Quantitative Biomedicine on the Busch Campus. Faculty members, including the Interim Director of the RAD Collaboratory, have decades of experience working with undergraduate students on projects that have been launched to production for use by millions of users, described in scientific publications, and recognized with awards. Successful students have continued their work in graduate and professional schools or in industry."
|