Developer Guide¶
This guide is for developers who want to contribute to the Gaussian Extractor project. It covers the codebase structure, development setup, coding standards, and contribution guidelines.
Project Overview¶
Architecture¶
Gaussian Extractor is a C++20 application designed for high-performance processing of Gaussian computational chemistry log files. The codebase follows a modular architecture with clear separation of concerns:
├── src/
│   ├── main.cpp                 # Application entry point
│   ├── extraction/
│   │   ├── coord_extractor.cpp
│   │   ├── coord_extractor.h
│   │   ├── gaussian_extractor.cpp
│   │   └── gaussian_extractor.h
│   ├── high_level/
│   │   ├── high_level_energy.cpp
│   │   └── high_level_energy.h
│   ├── input_gen/
│   │   ├── create_input.cpp
│   │   ├── create_input.h
│   │   ├── parameter_parser.cpp
│   │   └── parameter_parser.h
│   ├── job_management/
│   │   ├── job_checker.cpp
│   │   ├── job_checker.h
│   │   ├── job_scheduler.cpp
│   │   └── job_scheduler.h
│   ├── ui/
│   │   ├── help_utils.cpp
│   │   ├── help_utils.h
│   │   ├── interactive_mode.cpp
│   │   └── interactive_mode.h
│   └── utilities/
│       ├── command_system.cpp
│       ├── command_system.h
│       ├── config_manager.cpp
│       ├── config_manager.h
│       ├── metadata.cpp
│       ├── metadata.h
│       ├── module_executor.cpp
│       ├── module_executor.h
│       ├── utils.cpp
│       ├── utils.h
│       └── version.h
├── tests/
├── docs/
├── resources/
├── CMakeLists.txt               # CMake build configuration
├── Doxyfile                     # Doxygen configuration
├── LICENSE                      # Project license
├── Makefile                     # Make build system
└── README.MD                    # User documentation
New Modules in v0.5.0¶
- Interactive Mode (interactive_mode.h/.cpp)
  - Windows-specific interactive interface
  - Menu-driven command selection
  - Automatic extraction before entering interactive mode
- Coordinate Processing (coord_extractor.h/.cpp)
  - Extract final Cartesian coordinates from log files
  - XYZ format conversion and organization
  - Support for completed and running job separation
- Input Generation (create_input.h/.cpp)
  - Generate Gaussian input files from XYZ coordinates
  - Template system for reusable parameter sets
  - Support for multiple calculation types (SP, OPT, TS, IRC)
- High-Level Energy Calculations (high_level_energy.h/.cpp)
  - Combine high-level electronic energies with low-level thermal corrections
  - Support for kJ/mol and atomic unit outputs
  - Directory-based energy combination workflow
- Job Status Management (job_checker.h/.cpp)
  - Comprehensive job status checking and organization
  - Support for multiple error types (PCM, imaginary frequencies)
  - Automated file organization by job status
- Metadata Handling (metadata.h/.cpp)
  - File metadata extraction and validation
  - Job completion status detection
  - File size and timestamp tracking
- Parameter File Parsing (parameter_parser.h/.cpp)
  - Template parameter file parsing
  - Configuration file format support
  - Validation and error reporting
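The high-level energy workflow listed above comes down to simple arithmetic: a thermal correction from the low-level calculation is added to the high-level electronic energy. The following is a minimal sketch of that idea, assuming the correction is the difference between the low-level Gibbs free energy and the low-level electronic energy. The struct and function names are hypothetical and are not taken from high_level_energy.h.

```cpp
#include <cassert>

// Hypothetical illustration; none of these names come from high_level_energy.h.
// Conversion factor: 1 hartree is approximately 2625.4996 kJ/mol.
constexpr double kHartreeToKjPerMol = 2625.4996;

struct LowLevelResult {
    double electronic_energy;  // electronic (SCF) energy in hartree
    double gibbs_free_energy;  // thermally corrected free energy in hartree
};

// Thermal correction = G(low) - E(low), both in hartree.
constexpr double thermal_correction(const LowLevelResult& low) {
    return low.gibbs_free_energy - low.electronic_energy;
}

// Combined free energy = E(high) + thermal correction taken from the low level.
constexpr double combined_gibbs(double high_level_energy, const LowLevelResult& low) {
    return high_level_energy + thermal_correction(low);
}

// Converts an energy from atomic units (hartree) to kJ/mol.
constexpr double hartree_to_kj_per_mol(double hartree) {
    return hartree * kHartreeToKjPerMol;
}
```

Keeping the unit conversion in a separate function means the combination itself always works in hartree, and kJ/mol is only produced at output time.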
Key Design Principles¶
- Modularity
  - Each module has a single responsibility
  - Clear interfaces between components
  - Easy to test and maintain
- Performance
  - Multi-threaded processing
  - Memory-efficient algorithms
  - Cluster-aware resource management
- Safety
  - Comprehensive error handling
  - Resource cleanup on failures
  - Graceful shutdown handling
- Usability
  - Intuitive command-line interface
  - Extensive help system
  - Configuration file support
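Two of the principles above, multi-threaded processing and graceful shutdown, can be illustrated together. The sketch below is not the project's actual scheduler: `process_in_parallel`, its parameters, and the stop flag are hypothetical names chosen for illustration. Workers claim file indices from a shared atomic counter and stop picking up new work once a stop flag is set.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical sketch; not taken from job_scheduler.h.
// Returns the number of files that were actually processed.
// `process_one` must not throw: an exception escaping a std::thread
// calls std::terminate.
inline std::size_t process_in_parallel(
    const std::vector<std::string>& files,
    unsigned thread_count,
    const std::function<void(const std::string&)>& process_one,
    const std::atomic<bool>& stop_requested) {
    assert(thread_count > 0);
    std::atomic<std::size_t> next_index{0};
    std::atomic<std::size_t> processed{0};

    auto worker = [&]() {
        // Check the stop flag before claiming each new file.
        while (!stop_requested.load()) {
            const std::size_t i = next_index.fetch_add(1);
            if (i >= files.size()) break;  // no work left
            process_one(files[i]);
            processed.fetch_add(1);
        }
    };

    std::vector<std::thread> workers;
    workers.reserve(thread_count);
    for (unsigned t = 0; t < thread_count; ++t) workers.emplace_back(worker);
    for (auto& w : workers) w.join();  // always join before returning
    return processed.load();
}
```

A file that is already being processed when the flag is set is allowed to finish, so no worker is interrupted halfway through a file.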
Development Setup¶
Prerequisites¶
Required Tools:
C++ Compiler: GCC 10+, Intel oneAPI, or Clang 10+
Build System: Make (included with most Linux distributions)
Documentation: Sphinx (for building documentation)
Git: Version control system
Optional Tools:
CMake: Alternative build system
Doxygen: API documentation generation
Valgrind: Memory debugging
Clang-Tidy: Code analysis
Getting the Source Code¶
# Clone the repository
git clone https://github.com/lenhanpham/gaussian-extractor.git
cd gaussian-extractor
# Create a development branch
git checkout -b feature/your-feature-name
Building for Development¶
Debug Build:
# Build with debug symbols and safety checks
make debug
# Or with CMake
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug ..
make
Release Build:
# Optimized release build
make release
# Or with CMake
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
Development Build with All Features:
# Full development build
make -j $(nproc)
Testing¶
Running Tests:
# Build and run tests
make test
# Run specific test suite
./test_runner --suite extraction_tests
# Run with verbose output
./test_runner -v
Test Coverage:
# Generate coverage report
make coverage
# View coverage in browser
firefox coverage_report/index.html
Code Quality Tools¶
Static Analysis:
# Run clang-tidy
clang-tidy src/main.cpp src/*/*.cpp -- -std=c++20 -Isrc
# Run cppcheck
cppcheck --enable=all --std=c++20 src/
Code Formatting:
# Format code with clang-format
find src/ -name "*.cpp" -o -name "*.h" | xargs clang-format -i
# Check formatting
find src/ -name "*.cpp" -o -name "*.h" | xargs clang-format --dry-run -Werror
Documentation¶
Building Documentation:
# Install Sphinx
pip install sphinx sphinx-rtd-theme
# Build HTML documentation
cd docs
make html
# View documentation
firefox _build/html/index.html
API Documentation:
# Generate Doxygen documentation
doxygen Doxyfile
# View API docs
firefox doxygen/html/index.html
Coding Standards¶
Code Style¶
Naming Conventions:
// Classes and structs
class CommandParser;
struct CommandContext;
// Functions and methods
void parse_command_line(int argc, char* argv[]);
CommandContext create_context();
// Variables
int thread_count;
std::string output_file;
// Constants
const int DEFAULT_THREAD_COUNT = 4;
const std::string CONFIG_FILE_NAME = ".gaussian_extractor.conf";
// Member variables (with m_ prefix)
class MyClass {
private:
    int m_thread_count;
    std::string m_config_file;
};
File Organization:
Headers (.h): Class declarations, function prototypes, constants
Implementations (.cpp): Function definitions, implementation details
One class per file when possible
Related functionality grouped in modules
Documentation Standards¶
Doxygen Comments:
/**
* @brief Brief description of the function/class
*
* Detailed description explaining what the function does,
* its parameters, return values, and any important notes.
*
* @param param1 Description of first parameter
* @param param2 Description of second parameter
* @return Description of return value
*
* @par Usage Example
* @code
* // Example usage
* int result = my_function(param1, param2);
* @endcode
*
* @note Important notes about usage or limitations
* @warning Warnings about potential issues
* @see Related functions or classes
*/
int my_function(int param1, const std::string& param2);
Inline Comments:
// Use comments for complex logic
if (condition) {
    // Explain why this condition is important
    do_something();
}
// Use TODO comments for future improvements
// TODO: Optimize this loop for better performance
Error Handling¶
Exception Safety:
try {
    // Operation that might fail
    process_files(file_list);
} catch (const std::invalid_argument& e) {
    // Handle invalid arguments
    std::cerr << "Invalid argument: " << e.what() << std::endl;
    return 1;
} catch (const std::runtime_error& e) {
    // Handle runtime errors
    std::cerr << "Runtime error: " << e.what() << std::endl;
    return 2;
} catch (const std::exception& e) {
    // Handle all other exceptions
    std::cerr << "Unexpected error: " << e.what() << std::endl;
    return 3;
}
Return Codes:
/**
* @return 0 on success
* @return 1 on general error
* @return 2 on invalid arguments
* @return 3 on resource unavailable
* @return 4 on operation interrupted
*/
int process_data(const std::string& input_file);
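One way to keep these numeric codes consistent across modules is to name them once and translate exceptions at a single boundary. The sketch below assumes the values from the return-code list above; `ExitCode`, `to_int`, and `run_with_exit_code` are hypothetical names, not part of the existing codebase.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical names; the values mirror the documented return codes.
enum class ExitCode : int {
    Success = 0,
    GeneralError = 1,
    InvalidArguments = 2,
    ResourceUnavailable = 3,
    Interrupted = 4
};

constexpr int to_int(ExitCode code) { return static_cast<int>(code); }

// Runs `task` and maps any exception it throws to an exit code.
// More specific handlers must come before std::exception.
template <typename Task>
int run_with_exit_code(Task&& task) {
    try {
        task();
        return to_int(ExitCode::Success);
    } catch (const std::invalid_argument&) {
        return to_int(ExitCode::InvalidArguments);
    } catch (const std::exception&) {
        return to_int(ExitCode::GeneralError);
    }
}
```

With a helper like this, `main` could return `run_with_exit_code([&] { /* dispatch command */ });` and individual modules would only need to throw.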
Memory Management¶
RAII Pattern:
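#include <fstream>
#include <stdexcept>
#include <string>

class FileProcessor {
public:
    // explicit prevents accidental conversion from a string to a FileProcessor
    explicit FileProcessor(const std::string& filename)
        : m_file(filename) {
        if (!m_file.is_open()) {
            throw std::runtime_error("Failed to open file: " + filename);
        }
    }
    // Automatic cleanup: no user-defined destructor is needed because
    // std::ifstream closes the file in its own destructor.
private:
    std::ifstream m_file;
};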
Smart Pointers:
// Use unique_ptr for exclusive ownership
std::unique_ptr<CommandContext> context = std::make_unique<CommandContext>();
// Use shared_ptr for shared ownership
std::shared_ptr<ConfigManager> config = std::make_shared<ConfigManager>();
Thread Safety¶
Thread-Safe Classes:
#include <mutex>

class ThreadSafeCounter {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(m_mutex);
        ++m_count;
    }
    int get_count() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_count;
    }
private:
    mutable std::mutex m_mutex;  // mutable so get_count() can lock in a const method
    int m_count{0};
};
Threading Guidelines:
Document thread safety guarantees
Use appropriate synchronization primitives
Avoid global mutable state
Test concurrent access patterns
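The last guideline, testing concurrent access patterns, can be sketched with the counter class shown earlier. The helper `hammer_counter` is a hypothetical name. It starts several threads that increment the same counter and returns the final total; a lost update would show up as a total smaller than expected.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Same counter as in the thread-safety example, repeated so the sketch compiles alone.
class ThreadSafeCounter {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(m_mutex);
        ++m_count;
    }
    int get_count() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_count;
    }
private:
    mutable std::mutex m_mutex;
    int m_count{0};
};

// Hypothetical helper: runs `threads` threads that each call increment()
// `per_thread` times, then returns the counter's final value.
inline int hammer_counter(int threads, int per_thread) {
    assert(threads >= 0 && per_thread >= 0);
    ThreadSafeCounter counter;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) counter.increment();
        });
    }
    for (auto& th : pool) th.join();  // wait for every thread before reading
    return counter.get_count();
}
```

In a GoogleTest case this would be checked with something like `EXPECT_EQ(hammer_counter(8, 10000), 80000);`. A passing count does not prove the absence of races, so building such tests with `-fsanitize=thread` (GCC or Clang) is a useful complement.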
Contributing¶
Development Workflow¶
1. Choose an Issue:
# Check available issues
# Visit: https://github.com/lenhanpham/gaussian-extractor
2. Create a Branch:
# Create and switch to feature branch
git checkout -b feature/descriptive-name
# Or for bug fixes
git checkout -b bugfix/issue-number-description
3. Make Changes:
# Make your changes following coding standards
# Add tests for new functionality
# Update documentation as needed
4. Test Your Changes:
# Build and test
make debug
make test
# Run code quality checks
make lint
5. Commit Your Changes:
# Stage your changes
git add .
# Commit with descriptive message
git commit -m "feat: add new feature description
- What was changed
- Why it was changed
- How it was tested"
6. Push and Create Pull Request:
# Push your branch
git push -u origin feature/descriptive-name
# Create pull request on GitHub
Pull Request Guidelines¶
PR Title Format:
type(scope): description
Types: feat, fix, docs, style, refactor, test, chore
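The title format can be checked mechanically. The sketch below is a hypothetical helper, not an existing project tool. It assumes the `(scope)` part is optional, as in the commit example `feat: add new feature description`, and that a scope consists of letters, digits, underscores, or hyphens.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical checker for "type(scope): description" titles.
// Assumptions: the scope is optional and the description must start
// with a non-space character after ": ".
inline bool is_valid_pr_title(const std::string& title) {
    static const std::regex pattern(
        R"((feat|fix|docs|style|refactor|test|chore)(\([-A-Za-z0-9_]+\))?: \S.*)");
    // regex_match requires the whole title to match the pattern.
    return std::regex_match(title, pattern);
}
```

For example, `is_valid_pr_title("feat(extraction): add XYZ export")` is accepted, while `"feature: add thing"` is rejected because `feature` is not one of the listed types.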
PR Description Template:
## Description
Brief description of the changes
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update
## Testing
- [ ] Unit tests added/updated
- [ ] Integration tests added/updated
- [ ] Manual testing performed
## Checklist
- [ ] Code follows style guidelines
- [ ] Documentation updated
- [ ] Tests pass
- [ ] No breaking changes
Code Review Process¶
Review Checklist:
[ ] Code follows established patterns
[ ] Appropriate error handling
[ ] Thread safety considerations
[ ] Performance implications
[ ] Documentation updated
[ ] Tests included
[ ] No security vulnerabilities
Review Comments:
Be constructive and specific
Suggest improvements, don’t just point out problems
Reference coding standards when applicable
Acknowledge good practices
Testing Guidelines¶
Unit Testing¶
Test Structure:
#include <gtest/gtest.h>
#include "utilities/command_system.h"

class CommandParserTest : public ::testing::Test {
protected:
    void SetUp() override {
        // Setup code
    }
    void TearDown() override {
        // Cleanup code
    }
};

TEST_F(CommandParserTest, ParseExtractCommand) {
    // Test extract command parsing.
    // String literals are const in C++, so the arguments live in writable buffers.
    char arg0[] = "gaussian_extractor.x";
    char arg1[] = "extract";
    char arg2[] = "-t";
    char arg3[] = "300";
    char* argv[] = {arg0, arg1, arg2, arg3};
    CommandContext context = CommandParser::parse(4, argv);
    EXPECT_EQ(context.command, CommandType::EXTRACT);
    // Compare floating-point values with a tolerance-aware assertion
    EXPECT_DOUBLE_EQ(context.temp, 300.0);
}
Running Tests:
# Run all tests
make test
# Run specific test
./test_runner --gtest_filter=CommandParserTest.ParseExtractCommand
# Run with coverage
make coverage
Integration Testing¶
End-to-End Tests:
# Test complete workflows
./test_integration.sh
# Test with sample data
./gaussian_extractor.x -f test_data/ --output test_results/
Performance Testing¶
Benchmarking:
# Run performance benchmarks
make benchmark
# Profile application
valgrind --tool=callgrind ./gaussian_extractor.x [args]
# Memory profiling
valgrind --tool=massif ./gaussian_extractor.x [args]
Continuous Integration¶
CI/CD Pipeline¶
Automated Testing:
Build: Compile on multiple platforms (Linux, Windows)
Test: Run unit and integration tests
Lint: Code quality checks
Docs: Build documentation
Release: Automated releases
GitHub Actions Workflow:
name: CI
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: make -j 4
      - name: Test
        run: make test
      - name: Lint
        run: make lint
Release Process¶
Version Numbering¶
Semantic Versioning:
MAJOR.MINOR.PATCH
- MAJOR: Breaking changes
- MINOR: New features (backward compatible)
- PATCH: Bug fixes (backward compatible)
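The ordering implied by these rules can be made concrete with a small parser and comparison. This is a sketch only: `SemVer`, `parse_semver`, and `is_breaking_upgrade` are hypothetical names and do not describe the contents of version.h. It accepts plain `MAJOR.MINOR.PATCH` strings with an optional leading `v`, and ignores pre-release or build suffixes.

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>
#include <tuple>

// Hypothetical helper type; not taken from version.h.
struct SemVer {
    int major_part;
    int minor_part;
    int patch_part;
};

// Orders versions by MAJOR first, then MINOR, then PATCH.
inline bool operator<(const SemVer& a, const SemVer& b) {
    return std::tie(a.major_part, a.minor_part, a.patch_part) <
           std::tie(b.major_part, b.minor_part, b.patch_part);
}

// Parses "1.2.3" or "v1.2.3"; returns std::nullopt for anything else.
// Each field is limited to 9 digits so std::stoi cannot overflow.
inline std::optional<SemVer> parse_semver(const std::string& text) {
    static const std::regex pattern(R"(v?(\d{1,9})\.(\d{1,9})\.(\d{1,9}))");
    std::smatch m;
    if (!std::regex_match(text, m, pattern)) return std::nullopt;
    return SemVer{std::stoi(m[1].str()), std::stoi(m[2].str()), std::stoi(m[3].str())};
}

// A larger MAJOR number signals a release with breaking changes.
inline bool is_breaking_upgrade(const SemVer& from, const SemVer& to) {
    return to.major_part > from.major_part;
}
```

Comparing fields numerically matters: as strings, `0.10.0` would sort before `0.5.0`, whereas the tuple comparison correctly treats it as the newer version.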
Release Checklist:
[ ] Update version in version.h
[ ] Update CHANGELOG.md
[ ] Update documentation
[ ] Create release branch
[ ] Run full test suite
[ ] Create GitHub release
[ ] Update package repositories
Release Commands:
# Create release branch
git checkout -b release/v1.2.3
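# Update the version constants in src/utilities/version.h (see the release checklist)
"${EDITOR:-vi}" src/utilities/version.h
# Commit and tag
git add src/utilities/version.h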
git commit -m "Release v1.2.3"
git tag -a v1.2.3 -m "Release v1.2.3"
# Push release
git push origin release/v1.2.3
git push origin v1.2.3
Support and Communication¶
Communication Channels:
GitHub Issues: Bug reports and feature requests
GitHub Discussions: General questions and discussions
Pull Request Comments: Code review discussions
Getting Help:
Check existing issues and documentation first
Use descriptive titles for issues
Provide minimal reproducible examples
Include system information and versions
Community Guidelines:
Be respectful and constructive
Help newcomers learn and contribute
Follow the code of conduct
Acknowledge contributions from others
This guide covers what you need to contribute to the Gaussian Extractor project. Following it helps keep the codebase consistent and maintainable.