omecp/
reaction_path.rs

1//! Reaction path optimization and coordinate driving methods.
2//!
3//! This module implements various methods for exploring reaction paths and potential
4//! energy surfaces in quantum chemistry calculations. It provides tools for:
5//!
6//! - **Coordinate Driving**: Systematically vary geometric parameters along a reaction coordinate
7//! - **Nudged Elastic Band (NEB)**: Optimize minimum energy paths between reactants and products
8//! - **Path Analysis**: Calculate path statistics and analyze reaction mechanisms
9//! - **Constraint Handling**: Apply geometric constraints during path optimization
10//!
11//! # Theoretical Background
12//!
13//! ## Reaction Coordinates
14//!
15//! A reaction coordinate is a geometric parameter that describes the progress of a
16//! chemical reaction. Common reaction coordinates include:
17//!
18//! - **Bond lengths**: Distance between two atoms (r₁₂)
19//! - **Bond angles**: Angle between three atoms (∠ABC)
20//! - **Dihedral angles**: Torsion angle between four atoms (∠ABCD)
21//!
22//! ## Coordinate Driving
23//!
24//! Coordinate driving systematically varies a chosen reaction coordinate from an
25//! initial value to a final value while optimizing all other degrees of freedom.
26//! This generates a series of geometries that represent a possible reaction path.
27//!
28//! The method uses constrained optimization at each point:
29//! ```text
30//! minimize E(x) subject to g(x) = target_value
31//! ```
32//!
33//! where E(x) is the energy, g(x) is the reaction coordinate, and target_value
34//! varies from start to end.
35//!
36//! ## Nudged Elastic Band (NEB) Method
37//!
38//! The NEB method finds the minimum energy path (MEP) between two known structures
39//! (reactant and product) by optimizing a chain of intermediate images connected
40//! by springs. The method balances two forces:
41//!
42//! 1. **Spring forces**: Keep images evenly distributed along the path
43//! 2. **True forces**: Drive images toward the minimum energy path
44//!
45//! The NEB force on image i is:
46//! ```text
47//! F_i = F_i^spring + F_i^⊥
48//! ```
49//!
50//! where F_i^spring is the spring force along the path tangent and F_i^⊥ is the
51//! component of the true force perpendicular to the path.
52//!
53//! # Applications
54//!
55//! - **Transition State Search**: Find saddle points along reaction paths
56//! - **Reaction Mechanism Elucidation**: Understand how reactions proceed
57//! - **Barrier Height Calculation**: Determine activation energies
58//! - **Conformational Analysis**: Explore molecular conformational changes
59//!
60//! # References
61//!
62//! - Henkelman, G.; Jónsson, H. *J. Chem. Phys.* **2000**, 113, 9978-9985.
63//! - Henkelman, G.; Uberuaga, B. P.; Jónsson, H. *J. Chem. Phys.* **2000**, 113, 9901-9904.
64//! - Sheppard, D.; Terrell, R.; Henkelman, G. *J. Chem. Phys.* **2008**, 128, 134106.
65
66use crate::constraints::{evaluate_constraints, Constraint};
67use crate::geometry::Geometry;
68
69/// Types of geometric coordinates that can be driven during reaction path exploration.
70///
71/// This enum defines the fundamental geometric parameters used in quantum chemistry
72/// to describe molecular structure and reaction progress. Each coordinate type
73/// corresponds to a specific mathematical relationship between atoms.
74///
75/// # Coordinate Definitions
76///
77/// ## Bond Length (r₁₂)
78/// The distance between two atoms, calculated as:
79/// ```text
80/// r₁₂ = |r₂ - r₁| = √[(x₂-x₁)² + (y₂-y₁)² + (z₂-z₁)²]
81/// ```
82///
83/// ## Bond Angle (θ₁₂₃)
84/// The angle between three atoms, calculated using the dot product:
85/// ```text
86/// θ₁₂₃ = arccos[(r₁₂ · r₃₂) / (|r₁₂| × |r₃₂|)]
87/// ```
88/// where r₁₂ = r₁ - r₂ and r₃₂ = r₃ - r₂
89///
90/// ## Dihedral Angle (φ₁₂₃₄)
91/// The torsion angle between four atoms, calculated using cross products:
92/// ```text
93/// φ₁₂₃₄ = arctan2[(r₂₃ · (n₁ × n₂)), (n₁ · n₂)]
94/// ```
95/// where n₁ = r₁₂ × r₂₃ and n₂ = r₂₃ × r₃₄
96///
97/// # Usage in Reaction Path Studies
98///
99/// - **Bond**: Ideal for bond formation/breaking reactions
100/// - **Angle**: Useful for studying angular deformations and ring closures
101/// - **Dihedral**: Essential for conformational changes and rotational barriers
102///
103/// # Examples
104///
105/// ```rust
106/// use omecp::reaction_path::CoordinateType;
107///
108/// // For studying bond dissociation
109/// let bond_coord = CoordinateType::Bond;
110///
111/// // For ring-opening reactions
112/// let angle_coord = CoordinateType::Angle;
113///
114/// // For conformational isomerization
115/// let dihedral_coord = CoordinateType::Dihedral;
116/// ```
117#[derive(Debug, Clone, PartialEq)]
118pub enum CoordinateType {
119    /// Bond length coordinate between two atoms.
120    ///
121    /// Measures the distance between atoms i and j. Units are typically
122    /// in Angstroms (Angstrom) or Bohr radii. This is the most common reaction
123    /// coordinate for bond formation and dissociation processes.
124    Bond,
125
126    /// Bond angle coordinate between three atoms.
127    ///
128    /// Measures the angle formed by atoms i-j-k, where j is the central atom.
129    /// Units are in radians internally, but often displayed in degrees.
130    /// Useful for studying angular deformations and ring strain.
131    Angle,
132
133    /// Dihedral (torsion) angle coordinate between four atoms.
134    ///
135    /// Measures the torsion angle around the j-k bond in the sequence i-j-k-l.
136    /// Units are in radians internally. Essential for studying conformational
137    /// changes, rotational barriers, and stereochemical transformations.
138    Dihedral,
139}
140
141/// Represents a specific geometric coordinate to drive during reaction path exploration.
142///
143/// This struct encapsulates all information needed to define and manipulate a
144/// reaction coordinate during path optimization or coordinate driving procedures.
145/// It serves as the fundamental building block for reaction path methods.
146///
147/// # Coordinate Specification
148///
149/// The coordinate is fully specified by:
150/// 1. **Type**: Bond, angle, or dihedral (determines the mathematical formula)
151/// 2. **Atoms**: The specific atoms involved (defines which atoms to measure)
152/// 3. **Target**: The desired value to drive the coordinate toward
153///
154/// # Atom Indexing Convention
155///
156/// All atom indices are **0-based** (first atom is index 0). The number of atoms
157/// required depends on the coordinate type:
158///
159/// - **Bond**: 2 atoms `[i, j]` - distance between atoms i and j
160/// - **Angle**: 3 atoms `[i, j, k]` - angle i-j-k with j as vertex
161/// - **Dihedral**: 4 atoms `[i, j, k, l]` - torsion around j-k bond
162///
163/// # Units and Conventions
164///
165/// - **Bond lengths**: Angstroms (Angstrom) - typical range 0.5-5.0 Angstrom
166/// - **Angles**: Radians internally - typical range 0 to π (0° to 180°)
167/// - **Dihedrals**: Radians internally - range -π to π (-180° to 180°)
168///
169/// # Examples
170///
171/// ```rust
172/// use omecp::reaction_path::{DriveCoordinate, CoordinateType};
173///
174/// // C-H bond dissociation: drive from 1.1 Angstrom to 3.0 Angstrom
175/// let bond_breaking = DriveCoordinate::new(
176///     CoordinateType::Bond,
177///     vec![0, 1],  // Carbon (0) to Hydrogen (1)
178///     3.0          // Target distance in Angstroms
179/// );
180///
181/// // Ring opening: drive C-C-C angle from 60° to 120°
182/// let ring_opening = DriveCoordinate::new(
183///     CoordinateType::Angle,
184///     vec![0, 1, 2],           // Atoms forming the angle
185///     120.0_f64.to_radians()   // Target angle in radians
186/// );
187///
188/// // Conformational change: rotate around C-C bond
189/// let rotation = DriveCoordinate::new(
190///     CoordinateType::Dihedral,
191///     vec![0, 1, 2, 3],        // Four atoms defining dihedral
192///     180.0_f64.to_radians()   // Target dihedral in radians
193/// );
194/// ```
195#[derive(Debug, Clone)]
196pub struct DriveCoordinate {
197    /// The type of geometric coordinate being driven.
198    ///
199    /// Determines the mathematical formula used to calculate the coordinate
200    /// value and its derivatives. This affects how the constraint forces
201    /// are computed during optimization.
202    pub coord_type: CoordinateType,
203
204    /// Vector of atom indices (0-based) defining the coordinate.
205    ///
206    /// The length and interpretation depend on the coordinate type:
207    /// - **Bond**: `[atom1, atom2]` - 2 atoms
208    /// - **Angle**: `[atom1, atom2, atom3]` - 3 atoms (atom2 is vertex)
209    /// - **Dihedral**: `[atom1, atom2, atom3, atom4]` - 4 atoms (rotation around bond 2-3)
210    pub atoms: Vec<usize>,
211
212    /// The target value for the coordinate.
213    ///
214    /// Units depend on coordinate type:
215    /// - **Bond**: Angstroms (Angstrom)
216    /// - **Angle**: Radians (use `.to_radians()` to convert from degrees)
217    /// - **Dihedral**: Radians (range -π to π)
218    pub target_value: f64,
219}
220
221impl DriveCoordinate {
222    /// Creates a new `DriveCoordinate` instance.
223    ///
224    /// # Arguments
225    ///
226    /// * `coord_type` - The type of geometric coordinate (Bond, Angle, Dihedral).
227    /// * `atoms` - A vector of atom indices (0-based) defining the coordinate.
228    /// * `target_value` - The target value for the coordinate.
229    ///
230    /// # Returns
231    ///
232    /// A new `DriveCoordinate` instance.
233    pub fn new(coord_type: CoordinateType, atoms: Vec<usize>, target_value: f64) -> Self {
234        Self {
235            coord_type,
236            atoms,
237            target_value,
238        }
239    }
240
241    /// Calculates the current value of this coordinate in the given geometry.
242    ///
243    /// This method evaluates the coordinate using the current atomic positions
244    /// and returns the actual measured value. It's essential for monitoring
245    /// the progress of coordinate driving and checking convergence.
246    ///
247    /// # Arguments
248    ///
249    /// * `geometry` - The molecular geometry to evaluate the coordinate in
250    ///
251    /// # Returns
252    ///
253    /// The current value of the coordinate in appropriate units:
254    /// - Bond: distance in Angstroms
255    /// - Angle: angle in radians
256    /// - Dihedral: torsion angle in radians (-π to π)
257    ///
258    /// # Algorithm
259    ///
260    /// 1. Convert the coordinate specification to a constraint
261    /// 2. Use the constraint evaluation system to calculate the current value
262    /// 3. Return the measured value (constraint violation represents deviation from target)
263    ///
264    /// # Examples
265    ///
266    /// ```rust
267    /// use omecp::reaction_path::{DriveCoordinate, CoordinateType};
268    ///
269    /// let bond_coord = DriveCoordinate::new(
270    ///     CoordinateType::Bond,
271    ///     vec![0, 1],
272    ///     1.5  // Target value (not used in current_value)
273    /// );
274    ///
275    /// // let current_distance = bond_coord.current_value(&geometry);
276    /// // println!("Current C-H distance: {:.3} Angstrom", current_distance);
277    /// ```
278    pub fn current_value(&self, geometry: &Geometry) -> f64 {
279        let constraint = self.to_constraint();
280        let violations = evaluate_constraints(geometry, &[constraint]);
281        violations[0]
282    }
283
284    /// Creates a constraint object for this coordinate using the stored target value.
285    ///
286    /// This method converts the coordinate specification into a constraint that
287    /// can be used by the optimization system. The constraint represents the
288    /// mathematical relationship that should be satisfied.
289    ///
290    /// # Returns
291    ///
292    /// A `Constraint` enum variant appropriate for the coordinate type, with
293    /// the target value set to `self.target_value`.
294    ///
295    /// # Implementation Details
296    ///
297    /// The method maps coordinate types to constraint types:
298    /// - `CoordinateType::Bond` → `Constraint::Bond`
299    /// - `CoordinateType::Angle` → `Constraint::Angle`
300    /// - `CoordinateType::Dihedral` → `Constraint::Dihedral`
301    ///
302    /// Each constraint contains the atom indices and target value needed
303    /// for constraint evaluation and gradient calculation.
304    ///
305    /// # Examples
306    ///
307    /// ```rust
308    /// let coord = DriveCoordinate::new(
309    ///     CoordinateType::Bond,
310    ///     vec![0, 1],
311    ///     1.8
312    /// );
313    ///
314    /// let constraint = coord.to_constraint();
315    /// // This constraint can now be used in optimization
316    /// ```
317    pub fn to_constraint(&self) -> Constraint {
318        match self.coord_type {
319            CoordinateType::Bond => Constraint::Bond {
320                atoms: (self.atoms[0], self.atoms[1]),
321                target: self.target_value,
322            },
323            CoordinateType::Angle => Constraint::Angle {
324                atoms: (self.atoms[0], self.atoms[1], self.atoms[2]),
325                target: self.target_value,
326            },
327            CoordinateType::Dihedral => Constraint::Dihedral {
328                atoms: (self.atoms[0], self.atoms[1], self.atoms[2], self.atoms[3]),
329                target: self.target_value,
330            },
331        }
332    }
333
334    /// Creates a constraint object for this coordinate with a custom target value.
335    ///
336    /// This method is similar to `to_constraint()` but allows specifying a different
337    /// target value than the one stored in the coordinate. This is particularly
338    /// useful during coordinate driving where the target value changes at each step.
339    ///
340    /// # Arguments
341    ///
342    /// * `target_value` - The desired target value for the constraint
343    ///
344    /// # Returns
345    ///
346    /// A `Constraint` enum variant with the specified target value.
347    ///
348    /// # Use Cases
349    ///
350    /// - **Coordinate Driving**: Generate constraints for intermediate target values
351    /// - **Path Optimization**: Create constraints for specific path points
352    /// - **Scanning**: Systematically vary the target value for PES exploration
353    ///
354    /// # Examples
355    ///
356    /// ```rust
357    /// let coord = DriveCoordinate::new(
358    ///     CoordinateType::Bond,
359    ///     vec![0, 1],
360    ///     2.0  // Original target
361    /// );
362    ///
363    /// // Create constraint for intermediate value during driving
364    /// let intermediate_constraint = coord.to_constraint_with_value(1.5);
365    /// ```
366    pub fn to_constraint_with_value(&self, target_value: f64) -> Constraint {
367        match self.coord_type {
368            CoordinateType::Bond => Constraint::Bond {
369                atoms: (self.atoms[0], self.atoms[1]),
370                target: target_value,
371            },
372            CoordinateType::Angle => Constraint::Angle {
373                atoms: (self.atoms[0], self.atoms[1], self.atoms[2]),
374                target: target_value,
375            },
376            CoordinateType::Dihedral => Constraint::Dihedral {
377                atoms: (self.atoms[0], self.atoms[1], self.atoms[2], self.atoms[3]),
378                target: target_value,
379            },
380        }
381    }
382}
383
384/// Generates a series of geometries by systematically driving a coordinate from start to end value.
385///
386/// This function implements the coordinate driving method, which is fundamental in
387/// computational chemistry for exploring reaction paths and potential energy surfaces.
388/// It creates a series of molecular geometries where a chosen reaction coordinate
389/// is systematically varied while all other degrees of freedom are optimized.
390///
391/// # Theoretical Background
392///
393/// Coordinate driving solves a series of constrained optimization problems:
394/// ```text
395/// For each target value t_i:
396///   minimize E(x) subject to g(x) = t_i
397/// ```
398///
399/// where:
400/// - E(x) is the molecular energy
401/// - g(x) is the reaction coordinate function
402/// - t_i are intermediate target values from start_value to end_value
403/// - x represents all atomic coordinates
404///
405/// # Algorithm Overview
406///
407/// 1. **Discretization**: Divide the coordinate range into `num_steps` equal intervals
408/// 2. **Sequential Optimization**: For each target value:
409///    - Create a constraint with the current target value
410///    - Perform constrained optimization starting from the previous geometry
411///    - Store the optimized geometry
412/// 3. **Path Construction**: Return the series of optimized geometries
413///
414/// # Mathematical Details
415///
416/// The step size is calculated as:
417/// ```text
418/// Δt = (end_value - start_value) / (num_steps - 1)
419/// ```
420///
421/// Target values are:
422/// ```text
423/// t_i = start_value + i × Δt  for i = 0, 1, ..., num_steps-1
424/// ```
425///
426/// # Arguments
427///
428/// * `initial_geom` - Starting molecular geometry (reactant structure)
429/// * `drive_coord` - Specification of the coordinate to drive (bond, angle, or dihedral)
430/// * `start_value` - Initial value of the reaction coordinate
431/// * `end_value` - Final value of the reaction coordinate  
432/// * `num_steps` - Number of intermediate points to generate (including endpoints)
433///
434/// # Returns
435///
436/// Returns a `Vec<Geometry>` containing `num_steps` optimized geometries representing
437/// the reaction path. The first geometry corresponds to `start_value` and the last
438/// to `end_value`.
439///
440/// # Applications
441///
442/// - **Reaction Path Generation**: Create initial guess for transition state searches
443/// - **Barrier Height Estimation**: Calculate approximate activation energies
444/// - **Mechanism Elucidation**: Understand how molecular structure changes during reaction
445/// - **Conformational Analysis**: Explore conformational transitions
446///
447/// # Examples
448///
449/// ```rust
450/// use omecp::reaction_path::{drive_coordinate, DriveCoordinate, CoordinateType};
451///
452/// // Drive a C-H bond from 1.1 Angstrom to 3.0 Angstrom in 20 steps
453/// let bond_coord = DriveCoordinate::new(
454///     CoordinateType::Bond,
455///     vec![0, 1],  // Carbon-Hydrogen bond
456///     3.0          // Final target (not used in this function)
457/// );
458///
459/// let path = drive_coordinate(
460///     &initial_geometry,
461///     &bond_coord,
462///     1.1,    // Start: typical C-H bond length
463///     3.0,    // End: dissociated state
464///     20      // 20 intermediate geometries
465/// );
466///
467/// println!("Generated {} geometries along the dissociation path", path.len());
468/// ```
469///
470/// # Performance Considerations
471///
472/// - **Convergence**: Each constrained optimization may require 10-50 iterations
473/// - **Step Size**: Smaller steps (more points) give smoother paths but cost more
474/// - **Starting Guess**: Each optimization starts from the previous optimized geometry
475/// - **Constraint Stiffness**: Tight constraints may require more iterations
476///
477/// # Limitations
478///
479/// - **Single Coordinate**: Only one coordinate can be driven at a time
480/// - **Local Minima**: May get trapped in local minima along the path
481/// - **Constraint Satisfaction**: Assumes constraints can be satisfied at all points
482/// - **No Energy Information**: Does not consider energy barriers during driving
483pub fn drive_coordinate(
484    initial_geom: &Geometry,
485    drive_coord: &DriveCoordinate,
486    start_value: f64,
487    end_value: f64,
488    num_steps: usize,
489) -> Vec<Geometry> {
490    let mut geometries = Vec::new();
491    let step_size = (end_value - start_value) / (num_steps - 1) as f64;
492
493    // Start with initial geometry
494    let mut current_geom = initial_geom.clone();
495    geometries.push(current_geom.clone());
496
497    for i in 1..num_steps {
498        let target_value = start_value + step_size * i as f64;
499        println!(
500            "Driving coordinate to {:.3} (step {}/{})",
501            target_value,
502            i + 1,
503            num_steps
504        );
505
506        // Create constraint for the target coordinate value
507        let constraint = drive_coord.to_constraint_with_value(target_value);
508
509        // Perform constrained optimization to reach the target coordinate
510        current_geom = constrained_coordinate_driving(
511            &current_geom,
512            &[constraint],
513            50,   // max iterations for coordinate driving
514            1e-4, // convergence threshold
515        );
516
517        geometries.push(current_geom.clone());
518    }
519
520    geometries
521}
522
523/// Performs constrained optimization to drive a coordinate to its target value.
524///
525/// This function implements a simplified constrained optimization algorithm that
526/// adjusts atomic coordinates to satisfy geometric constraints while minimizing
527/// constraint violations. It uses a steepest descent approach with constraint forces.
528///
529/// # Theoretical Background
530///
531/// The method solves the constrained optimization problem:
532/// ```text
533/// minimize ||g(x) - target||²
534/// ```
535/// where g(x) represents the constraint functions and target are the desired values.
536///
537/// # Algorithm Details
538///
539/// ## Force Calculation
540/// For each constraint, the algorithm computes:
541/// 1. **Violation**: Current deviation from target value
542/// 2. **Constraint Force**: Gradient of the constraint with respect to coordinates
543/// 3. **Force Direction**: Points toward satisfying the constraint
544///
545/// ## Update Step
546/// The coordinates are updated using:
547/// ```text
548/// x_new = x_old + α × F_constraint
549/// ```
550/// where α is the step size and F_constraint is the total constraint force.
551///
552/// ## Convergence Criteria
553/// The optimization converges when:
554/// ```text
555/// Σ|g_i(x) - target_i| < threshold
556/// ```
557///
558/// # Arguments
559///
560/// * `initial_geom` - Starting geometry for the optimization
561/// * `constraints` - Array of geometric constraints to satisfy
562/// * `max_iterations` - Maximum number of optimization steps (typically 50-100)
563/// * `convergence_threshold` - Tolerance for constraint satisfaction (typically 1e-4)
564///
565/// # Returns
566///
567/// Returns the optimized `Geometry` with constraints satisfied within the threshold.
568///
569/// # Implementation Notes
570///
571/// ## Simplified Approach
572/// This implementation uses a basic steepest descent method rather than more
573/// sophisticated techniques like:
574/// - Lagrange multiplier methods
575/// - Sequential quadratic programming (SQP)
576/// - Augmented Lagrangian methods
577///
578/// ## Force Calculation Methods
579/// - **Bond constraints**: Analytical gradient along bond vector
580/// - **Angle/Dihedral constraints**: Finite difference approximation
581///
582/// ## Step Size Control
583/// - Fixed step size (0.01) - could be improved with adaptive methods
584/// - No line search or trust region methods
585///
586/// # Limitations
587///
588/// - **Convergence**: May be slow for difficult constraints
589/// - **Stability**: No guarantee of convergence for all systems
590/// - **Accuracy**: Finite difference gradients are approximate
591/// - **Efficiency**: Not optimized for computational performance
592///
593/// # Future Improvements
594///
595/// - Implement analytical gradients for all constraint types
596/// - Add adaptive step size control
597/// - Include second-order optimization methods
598/// - Add support for inequality constraints
599fn constrained_coordinate_driving(
600    initial_geom: &Geometry,
601    constraints: &[Constraint],
602    max_iterations: usize,
603    convergence_threshold: f64,
604) -> Geometry {
605    let mut geometry = initial_geom.clone();
606
607    for iteration in 0..max_iterations {
608        // Calculate constraint violations and forces
609        let mut constraint_forces = vec![0.0; geometry.coords.len()];
610        let mut total_violation = 0.0;
611
612        for constraint in constraints {
613            let violation = calculate_constraint_violation(&geometry, constraint);
614            total_violation += violation.abs();
615
616            // Calculate constraint force (simplified - would need proper gradients)
617            let force = calculate_constraint_force(&geometry, constraint, violation);
618            for i in 0..constraint_forces.len() {
619                constraint_forces[i] += force[i];
620            }
621        }
622
623        // Simple steepest descent step
624        let step_size = 0.01;
625        for (coord, &force) in geometry.coords.iter_mut().zip(&constraint_forces) {
626            *coord += step_size * force;
627        }
628
629        // Check convergence
630        if total_violation < convergence_threshold {
631            println!(
632                "Coordinate driving converged after {} iterations",
633                iteration + 1
634            );
635            break;
636        }
637
638        if iteration % 10 == 0 {
639            println!(
640                "Coordinate driving iteration {}: violation = {:.6}",
641                iteration + 1,
642                total_violation
643            );
644        }
645    }
646
647    geometry
648}
649
650/// Calculates the constraint violation for a single geometric constraint.
651///
652/// This function evaluates how much the current geometry deviates from the
653/// target constraint value. The violation is the difference between the
654/// current coordinate value and the desired target value.
655///
656/// # Mathematical Definition
657///
658/// For a constraint g(x) = target, the violation is:
659/// ```text
660/// violation = g(x) - target
661/// ```
662///
663/// # Arguments
664///
665/// * `geometry` - Current molecular geometry
666/// * `constraint` - The geometric constraint to evaluate
667///
668/// # Returns
669///
670/// The constraint violation as a signed value:
671/// - **Positive**: Current value exceeds target
672/// - **Negative**: Current value is below target  
673/// - **Zero**: Constraint is perfectly satisfied
674///
675/// # Examples
676///
677/// For a bond constraint with target 1.5 Angstrom:
678/// - Current bond = 1.8 Angstrom → violation = +0.3 Angstrom (too long)
679/// - Current bond = 1.2 Angstrom → violation = -0.3 Angstrom (too short)
680/// - Current bond = 1.5 Angstrom → violation = 0.0 Angstrom (satisfied)
681fn calculate_constraint_violation(geometry: &Geometry, constraint: &Constraint) -> f64 {
682    let violations = evaluate_constraints(geometry, std::slice::from_ref(constraint));
683    violations[0]
684}
685
686/// Calculates the constraint force (gradient) for a single geometric constraint.
687///
688/// This function computes the force that should be applied to atomic coordinates
689/// to reduce the constraint violation. The force points in the direction that
690/// will move the coordinate value toward its target.
691///
692/// # Theoretical Background
693///
694/// The constraint force is the negative gradient of the constraint violation:
695/// ```text
696/// F_constraint = -∇[g(x) - target] = -∇g(x)
697/// ```
698///
699/// For constraint satisfaction, we want to minimize |g(x) - target|², so:
700/// ```text
701/// F = -2 × (g(x) - target) × ∇g(x)
702/// ```
703///
704/// # Algorithm Implementation
705///
706/// ## Bond Constraints (Analytical)
707/// For bond constraints, the gradient is computed analytically:
708/// ```text
709/// ∇r₁₂ = (r₂ - r₁) / |r₂ - r₁|  (unit vector along bond)
710/// ```
711/// Forces are applied equally and oppositely to the two atoms.
712///
713/// ## Angle/Dihedral Constraints (Finite Difference)
714/// For complex constraints, finite difference approximation is used:
715/// ```text
716/// ∂g/∂xᵢ ≈ [g(x + δeᵢ) - g(x)] / δ
717/// ```
718/// where eᵢ is the unit vector in coordinate direction i.
719///
720/// # Arguments
721///
722/// * `geometry` - Current molecular geometry
723/// * `constraint` - The geometric constraint to compute forces for
724/// * `violation` - Current constraint violation (from `calculate_constraint_violation`)
725///
726/// # Returns
727///
728/// A vector of forces with length 3×N_atoms, where each group of 3 consecutive
729/// elements represents the [Fx, Fy, Fz] force components for one atom.
730///
731/// # Force Scaling
732///
733/// The force magnitude is proportional to the violation:
734/// ```text
735/// |F| = k × |violation|
736/// ```
737/// where k = 10.0 is an empirical force constant.
738///
739/// # Implementation Details
740///
741/// ## Bond Force Calculation
742/// 1. Calculate bond vector and distance
743/// 2. Compute unit vector along bond
744/// 3. Apply force proportional to violation
745/// 4. Equal and opposite forces on the two atoms
746///
747/// ## Finite Difference Parameters
748/// - **Step size (δ)**: 0.001 Angstrom (compromise between accuracy and numerical stability)
749/// - **One-sided difference**: Uses forward difference for simplicity
750/// - **Coordinate perturbation**: Each Cartesian coordinate perturbed independently
751///
752/// # Limitations
753///
754/// - **Finite difference errors**: Approximate gradients for angles/dihedrals
755/// - **Fixed step size**: Not adaptive to constraint type or system size
756/// - **No second derivatives**: Cannot account for constraint curvature
757/// - **Proportional control**: Simple linear relationship between violation and force
758fn calculate_constraint_force(
759    geometry: &Geometry,
760    constraint: &Constraint,
761    violation: f64,
762) -> Vec<f64> {
763    let mut force = vec![0.0; geometry.coords.len()];
764    let force_magnitude = -violation * 10.0; // Simple proportional control
765
766    match constraint {
767        Constraint::Bond {
768            atoms: (a1, a2), ..
769        } => {
770            // Force along the bond vector
771            let pos1 = geometry.get_atom_coords(*a1);
772            let pos2 = geometry.get_atom_coords(*a2);
773            let dist = calculate_distance(geometry, *a1, *a2);
774
775            if dist > 0.0 {
776                let dx = (pos2[0] - pos1[0]) / dist;
777                let dy = (pos2[1] - pos1[1]) / dist;
778                let dz = (pos2[2] - pos1[2]) / dist;
779
780                // Apply force to move atom2
781                let idx2 = a2 * 3;
782                force[idx2] += force_magnitude * dx;
783                force[idx2 + 1] += force_magnitude * dy;
784                force[idx2 + 2] += force_magnitude * dz;
785
786                // Apply opposite force to atom1
787                let idx1 = a1 * 3;
788                force[idx1] -= force_magnitude * dx;
789                force[idx1 + 1] -= force_magnitude * dy;
790                force[idx1 + 2] -= force_magnitude * dz;
791            }
792        }
793        Constraint::Angle { .. } => {
794            // For angles, use finite difference approximation
795            // This is simplified - a real implementation would use analytical gradients
796            let delta = 0.001;
797            for (i, _) in geometry.coords.iter().enumerate() {
798                let mut geom_plus = geometry.clone();
799                geom_plus.coords[i] += delta;
800                let violation_plus = calculate_constraint_violation(&geom_plus, constraint);
801
802                force[i] = -(violation_plus - violation) / delta;
803            }
804        }
805        Constraint::Dihedral { .. } => {
806            // For dihedrals, use finite difference approximation
807            // This is simplified - a real implementation would use analytical gradients
808            let delta = 0.001;
809            for (i, _) in geometry.coords.iter().enumerate() {
810                let mut geom_plus = geometry.clone();
811                geom_plus.coords[i] += delta;
812                let violation_plus = calculate_constraint_violation(&geom_plus, constraint);
813
814                force[i] = -(violation_plus - violation) / delta;
815            }
816        }
817    }
818
819    force
820}
821
822/// Optimizes a reaction path using the Nudged Elastic Band (NEB) method.
823///
824/// The NEB method is a powerful technique for finding minimum energy paths (MEPs)
825/// between known reactant and product structures. It optimizes a chain of intermediate
826/// images (geometries) connected by springs to find the most probable reaction pathway.
827///
828/// # Theoretical Foundation
829///
830/// ## The NEB Method
831/// NEB balances two competing forces on each intermediate image:
832///
833/// 1. **Spring Forces (F^spring)**: Keep images evenly distributed along the path
834/// 2. **True Forces (F^true)**: Drive images toward lower energy regions
835///
836/// The total NEB force on image i is:
837/// ```text
838/// F_i^NEB = F_i^spring + F_i^⊥
839/// ```
840///
841/// where F_i^⊥ is the component of the true force perpendicular to the path.
842///
843/// ## Spring Force Calculation
844/// The spring force along the path tangent τ̂ is:
845/// ```text
846/// F_i^spring = k[(|R_{i+1} - R_i| - |R_i - R_{i-1}|)] τ̂_i
847/// ```
848///
849/// where:
850/// - k is the spring constant (typically 0.1-1.0 eV/Angstrom²)
851/// - R_i represents the coordinates of image i
852/// - τ̂_i is the normalized tangent vector at image i
853///
854/// ## True Force Projection
855/// The perpendicular component of the true force is:
856/// ```text
857/// F_i^⊥ = F_i^true - (F_i^true · τ̂_i) τ̂_i
858/// ```
859///
860/// This ensures forces don't interfere with spring-controlled spacing.
861///
862/// # Algorithm Overview
863///
864/// 1. **Initialization**: Start with initial path (linear interpolation or guess)
865/// 2. **Force Calculation**: For each intermediate image:
866///    - Calculate tangent vector from neighboring images
867///    - Compute spring forces along the tangent
868///    - Project true forces perpendicular to path
869/// 3. **Geometry Update**: Move each image according to total NEB force
870/// 4. **Convergence Check**: Continue until forces are below threshold
871/// 5. **Constraint Application**: Apply any geometric constraints if specified
872///
873/// # Arguments
874///
875/// * `initial_path` - Starting path as a series of geometries (reactant → product)
876/// * `constraints` - Optional geometric constraints to maintain during optimization
877///
878/// # Returns
879///
880/// Returns an optimized `Vec<Geometry>` representing the minimum energy path.
881/// The first and last geometries (endpoints) remain fixed during optimization.
882///
883/// # Implementation Details
884///
885/// ## Current Implementation (Simplified)
886/// This implementation uses a simplified NEB approach:
887/// - **Spring forces only**: True energy gradients not included
888/// - **Fixed endpoints**: Reactant and product geometries unchanged
889/// - **Steepest descent**: Simple optimization without advanced methods
890/// - **Basic constraints**: Limited constraint handling
891///
892/// ## Parameters
893/// - **Spring constant**: k = 0.1 (relatively soft springs)
894/// - **Step size**: α = 0.01 (conservative for stability)
895/// - **Max iterations**: 100 (sufficient for most systems)
896/// - **Convergence threshold**: 1e-3 (moderate precision)
897///
898/// # Applications
899///
900/// - **Transition State Location**: Find saddle points along reaction paths
901/// - **Reaction Mechanism Studies**: Understand detailed reaction pathways
902/// - **Activation Energy Calculation**: Determine energy barriers
903/// - **Conformational Transitions**: Study large-scale molecular motions
904///
905/// # Examples
906///
907/// ```rust
908/// use omecp::reaction_path::optimize_reaction_path;
909///
910/// // Create initial path by linear interpolation
911/// let initial_path = vec![reactant_geom, intermediate_geom, product_geom];
912///
913/// // Optimize the path using NEB
914/// let optimized_path = optimize_reaction_path(&initial_path, &[]);
915///
916/// println!("Optimized path has {} images", optimized_path.len());
917/// ```
918///
919/// # Advanced NEB Variants (Not Implemented)
920///
921/// - **Climbing Image NEB (CI-NEB)**: Drives highest energy image to saddle point
922/// - **Adaptive NEB**: Automatically adjusts number of images
923/// - **String Method**: Alternative path optimization approach
924/// - **Growing String Method**: Dynamically extends path length
925///
926/// # References
927///
928/// - Henkelman, G.; Jónsson, H. *J. Chem. Phys.* **2000**, 113, 9978-9985.
929/// - Henkelman, G.; Uberuaga, B. P.; Jónsson, H. *J. Chem. Phys.* **2000**, 113, 9901-9904.
930pub fn optimize_reaction_path(
931    initial_path: &[Geometry],
932    constraints: &[Constraint],
933) -> Vec<Geometry> {
934    if initial_path.len() < 3 {
935        return initial_path.to_vec();
936    }
937
938    let mut path = initial_path.to_vec();
939    let max_iterations = 100;
940    let convergence_threshold = 1e-3;
941    let spring_constant = 0.1; // Spring constant for NEB
942
943    for iteration in 0..max_iterations {
944        let mut max_force = 0.0;
945        let mut new_path = Vec::new();
946
947        for (i, geometry) in path.iter().enumerate() {
948            if i == 0 || i == path.len() - 1 {
949                // Endpoints are fixed
950                new_path.push(geometry.clone());
951                continue;
952            }
953
954            // Calculate forces for this image
955            let force = calculate_neb_force(&path, i, spring_constant);
956
957            // Update geometry (simple steepest descent for now)
958            let step_size = 0.01;
959            let mut new_coords = geometry.coords.clone();
960
961            for j in 0..new_coords.len() {
962                new_coords[j] += step_size * force[j];
963            }
964
965            let mut new_geometry = Geometry::new(geometry.elements.clone(), new_coords.data.into());
966
967            // Apply constraints if any
968            if !constraints.is_empty() {
969                // This would need proper constrained optimization
970                // For now, just apply the constraints directly
971                for constraint in constraints {
972                    if let Constraint::Bond {
973                        atoms: (a1, a2),
974                        target,
975                    } = constraint
976                    {
977                        // Simple bond constraint application
978                        let current_dist = calculate_distance(&new_geometry, *a1, *a2);
979                        if (current_dist - *target).abs() > 0.01 {
980                            // Adjust bond length (simplified)
981                            adjust_bond_length(&mut new_geometry, *a1, *a2, *target);
982                        }
983                    }
984                }
985            }
986
987            new_path.push(new_geometry);
988
989            // Track maximum force for convergence
990            let force_magnitude = force.iter().map(|x| x * x).sum::<f64>().sqrt();
991            if force_magnitude > max_force {
992                max_force = force_magnitude;
993            }
994        }
995
996        path = new_path;
997
998        // Check convergence
999        if max_force < convergence_threshold {
1000            println!("NEB converged after {} iterations", iteration + 1);
1001            break;
1002        }
1003
1004        if iteration % 10 == 0 {
1005            println!(
1006                "NEB iteration {}: max force = {:.6}",
1007                iteration + 1,
1008                max_force
1009            );
1010        }
1011    }
1012
1013    path
1014}
1015
1016/// Calculates the NEB force for a specific image in the reaction path.
1017///
1018/// This function computes the spring forces that maintain proper spacing between
1019/// images in the NEB method. It implements the core force calculation that drives
1020/// the path optimization toward the minimum energy pathway.
1021///
1022/// # Theoretical Background
1023///
1024/// The NEB force on image i consists of spring forces along the path tangent:
1025/// ```text
1026/// F_i^spring = k × [(d_{i+1} - d_avg) - (d_i - d_avg)] × τ̂_i
1027/// ```
1028///
1029/// where:
1030/// - k is the spring constant
1031/// - d_i is the distance from image i to image i-1
1032/// - d_avg is the average spacing between all images
1033/// - τ̂_i is the normalized tangent vector at image i
1034///
1035/// # Algorithm Steps
1036///
1037/// 1. **Neighbor Identification**: Get previous and next images
1038/// 2. **Tangent Calculation**: Compute path tangent at current image
1039/// 3. **Distance Measurement**: Calculate distances to neighboring images
1040/// 4. **Spring Force**: Apply Hooke's law along the tangent direction
1041/// 5. **Force Balancing**: Ensure forces maintain even spacing
1042///
1043/// # Arguments
1044///
1045/// * `path` - Complete reaction path as array of geometries
1046/// * `image_index` - Index of the current image (0 = first, n-1 = last)
1047/// * `spring_constant` - Spring stiffness parameter (typically 0.1-1.0)
1048///
1049/// # Returns
1050///
1051/// Vector of forces with length 3×N_atoms, representing [Fx, Fy, Fz] components
1052/// for each atom in the molecule.
1053///
1054/// # Force Direction Logic
1055///
1056/// ## Interior Images (i = 1, 2, ..., n-2)
1057/// - **Previous spring**: Pulls toward image i-1 if too far apart
1058/// - **Next spring**: Pulls toward image i+1 if too far apart
1059/// - **Net effect**: Maintains even spacing along path
1060///
1061/// ## Endpoint Images (i = 0, n-1)
1062/// - **Fixed positions**: No forces applied (endpoints don't move)
1063/// - **Boundary conditions**: Provide reference for neighboring images
1064///
1065/// # Spring Constant Effects
1066///
1067/// - **High k (stiff springs)**: Images stay evenly spaced but may resist optimization
1068/// - **Low k (soft springs)**: Images can cluster but may lose path resolution
1069/// - **Typical values**: k = 0.1-0.5 eV/Angstrom² for most chemical systems
1070///
1071/// # Implementation Notes
1072///
1073/// ## Current Limitations
1074/// - **Spring forces only**: True energy gradients not included
1075/// - **Simple spacing**: Uses RMSD rather than arc length
1076/// - **No adaptive springs**: Constant spring constant throughout
1077///
1078/// ## Distance Metrics
1079/// - **RMSD**: Root mean square deviation between geometries
1080/// - **Average spacing**: Mean distance between consecutive images
1081/// - **Force scaling**: Proportional to deviation from average spacing
1082fn calculate_neb_force(path: &[Geometry], image_index: usize, spring_constant: f64) -> Vec<f64> {
1083    let n_images = path.len();
1084    let current = &path[image_index];
1085
1086    // Get neighboring images
1087    let prev = if image_index > 0 {
1088        Some(&path[image_index - 1])
1089    } else {
1090        None
1091    };
1092    let next = if image_index < n_images - 1 {
1093        Some(&path[image_index + 1])
1094    } else {
1095        None
1096    };
1097
1098    // Calculate tangent vector
1099    let tangent = calculate_tangent(prev, current, next);
1100
1101    // Calculate spring forces
1102    let mut spring_force = vec![0.0; current.coords.len()];
1103
1104    if let Some(prev_geom) = prev {
1105        let dist_prev = calculate_rmsd(current, prev_geom);
1106        let spring_force_magnitude = spring_constant * (dist_prev - get_average_spacing(path));
1107        for i in 0..spring_force.len() {
1108            spring_force[i] += spring_force_magnitude * tangent[i];
1109        }
1110    }
1111
1112    if let Some(next_geom) = next {
1113        let dist_next = calculate_rmsd(current, next_geom);
1114        let spring_force_magnitude = spring_constant * (dist_next - get_average_spacing(path));
1115        for i in 0..spring_force.len() {
1116            spring_force[i] -= spring_force_magnitude * tangent[i]; // Opposite direction
1117        }
1118    }
1119
1120    // For now, return spring force only (true NEB would include potential energy gradients)
1121    // In a full implementation, this would be:
1122    // F = F_spring + (F_perp - F_perp·τ * τ) where F_perp is the perpendicular component of the energy gradient
1123
1124    spring_force
1125}
1126
1127/// Calculates the tangent vector for NEB path optimization.
1128///
1129/// The tangent vector defines the local direction of the reaction path at each
1130/// image. It's crucial for the NEB method because it determines how spring forces
1131/// are applied and ensures that true forces don't interfere with path spacing.
1132///
1133/// # Theoretical Background
1134///
1135/// The tangent vector τ̂_i at image i represents the local path direction:
1136/// ```text
1137/// τ̂_i = (path direction at image i) / |path direction|
1138/// ```
1139///
1140/// Different algorithms exist for computing tangents:
1141/// - **Simple tangent**: τ = R_{i+1} - R_{i-1}
1142/// - **Bisector tangent**: τ = (R_{i+1} - R_i) + (R_i - R_{i-1})
1143/// - **Improved tangent**: Energy-weighted combination (not implemented here)
1144///
1145/// # Algorithm Implementation
1146///
1147/// ## Interior Images (both neighbors exist)
1148/// Uses the bisector method:
1149/// ```text
1150/// v_plus = (R_{i+1} - R_i) / |R_{i+1} - R_i|
1151/// v_minus = (R_i - R_{i-1}) / |R_i - R_{i-1}|
1152/// τ = v_plus + v_minus
1153/// τ̂ = τ / |τ|
1154/// ```
1155///
1156/// ## Boundary Images (one neighbor missing)
1157/// Uses simple difference:
1158/// ```text
1159/// τ = R_neighbor - R_current
1160/// τ̂ = τ / |τ|
1161/// ```
1162///
1163/// # Arguments
1164///
1165/// * `prev` - Previous geometry in path (None for first image)
1166/// * `current` - Current geometry for which to calculate tangent
1167/// * `next` - Next geometry in path (None for last image)
1168///
1169/// # Returns
1170///
1171/// Normalized tangent vector with length 3×N_atoms, representing the path
1172/// direction in Cartesian coordinate space.
1173///
1174/// # Tangent Vector Properties
1175///
1176/// ## Normalization
1177/// The tangent is always normalized: |τ̂| = 1
1178/// This ensures consistent force scaling regardless of path curvature.
1179///
1180/// ## Smoothness
1181/// The bisector method provides smoother tangents than simple differences,
1182/// reducing oscillations in the optimization.
1183///
1184/// ## Boundary Handling
1185/// Special treatment for endpoints prevents undefined tangents and
1186/// maintains proper force directions.
1187///
1188/// # Mathematical Details
1189///
1190/// ## Vector Operations
1191/// - **Subtraction**: Component-wise difference between geometries
1192/// - **Normalization**: Division by Euclidean norm
1193/// - **Addition**: Component-wise sum for bisector calculation
1194///
1195/// ## Coordinate Representation
1196/// The tangent vector contains [dx₁, dy₁, dz₁, dx₂, dy₂, dz₂, ...] where
1197/// (dxᵢ, dyᵢ, dzᵢ) represents the path direction for atom i.
1198///
1199/// # Applications in NEB
1200///
1201/// - **Spring force direction**: Springs act along τ̂
1202/// - **Force projection**: True forces projected perpendicular to τ̂
1203/// - **Path characterization**: Tangent describes local path geometry
1204fn calculate_tangent(
1205    prev: Option<&Geometry>,
1206    current: &Geometry,
1207    next: Option<&Geometry>,
1208) -> Vec<f64> {
1209    let mut tangent = vec![0.0; current.coords.len()];
1210
1211    match (prev, next) {
1212        (Some(p), Some(n)) => {
1213            // Both neighbors exist - use central difference
1214            let v_plus = subtract_geometries(n, current);
1215            let v_minus = subtract_geometries(current, p);
1216
1217            // Normalize vectors
1218            let norm_plus = v_plus.iter().map(|x| x * x).sum::<f64>().sqrt();
1219            let norm_minus = v_minus.iter().map(|x| x * x).sum::<f64>().sqrt();
1220
1221            if norm_plus > 0.0 && norm_minus > 0.0 {
1222                let v_plus_norm: Vec<f64> = v_plus.iter().map(|x| x / norm_plus).collect();
1223                let v_minus_norm: Vec<f64> = v_minus.iter().map(|x| x / norm_minus).collect();
1224
1225                for i in 0..tangent.len() {
1226                    tangent[i] = v_plus_norm[i] + v_minus_norm[i];
1227                }
1228            }
1229        }
1230        (Some(p), None) => {
1231            // Only previous neighbor - use forward difference
1232            let v = subtract_geometries(current, p);
1233            let norm = v.iter().map(|x| x * x).sum::<f64>().sqrt();
1234            if norm > 0.0 {
1235                for i in 0..tangent.len() {
1236                    tangent[i] = v[i] / norm;
1237                }
1238            }
1239        }
1240        (None, Some(n)) => {
1241            // Only next neighbor - use backward difference
1242            let v = subtract_geometries(n, current);
1243            let norm = v.iter().map(|x| x * x).sum::<f64>().sqrt();
1244            if norm > 0.0 {
1245                for i in 0..tangent.len() {
1246                    tangent[i] = v[i] / norm;
1247                }
1248            }
1249        }
1250        _ => {}
1251    }
1252
1253    // Normalize tangent
1254    let norm = tangent.iter().map(|x| x * x).sum::<f64>().sqrt();
1255    if norm > 0.0 {
1256        for val in &mut tangent {
1257            *val /= norm;
1258        }
1259    }
1260
1261    tangent
1262}
1263
1264/// Calculates the Root Mean Square Deviation (RMSD) between two molecular geometries.
1265///
1266/// RMSD is a standard metric for measuring the similarity between molecular structures.
1267/// It quantifies the average distance between corresponding atoms in two geometries,
1268/// providing a single number that represents structural similarity.
1269///
1270/// # Mathematical Definition
1271///
1272/// RMSD is calculated as:
1273/// ```text
1274/// RMSD = √[(1/N) × Σᵢ(rᵢ₁ - rᵢ₂)²]
1275/// ```
1276///
1277/// where:
1278/// - N is the total number of coordinate components (3 × number of atoms)
1279/// - rᵢ₁, rᵢ₂ are corresponding coordinate components in the two geometries
1280/// - The sum is over all x, y, z coordinates of all atoms
1281///
1282/// # Arguments
1283///
1284/// * `geom1` - First molecular geometry
1285/// * `geom2` - Second molecular geometry (must have same number of atoms)
1286///
1287/// # Returns
1288///
1289/// RMSD value in the same units as the input coordinates (typically Angstroms).
1290///
1291/// # Interpretation
1292///
1293/// - **RMSD ≈ 0**: Geometries are nearly identical
1294/// - **RMSD < 0.1 Angstrom**: Very similar structures (small conformational changes)
1295/// - **RMSD 0.1-0.5 Angstrom**: Moderate structural differences
1296/// - **RMSD > 0.5 Angstrom**: Significant structural changes
1297///
1298/// # Applications in Path Methods
1299///
1300/// - **Image spacing**: Measure distances between path images
1301/// - **Convergence criteria**: Check if optimization has converged
1302/// - **Path length calculation**: Sum of RMSD values along path
1303/// - **Quality assessment**: Evaluate path smoothness
1304fn calculate_rmsd(geom1: &Geometry, geom2: &Geometry) -> f64 {
1305    let mut sum_sq = 0.0;
1306    for i in 0..geom1.coords.len() {
1307        let diff = geom1.coords[i] - geom2.coords[i];
1308        sum_sq += diff * diff;
1309    }
1310    (sum_sq / geom1.coords.len() as f64).sqrt()
1311}
1312
1313/// Subtracts two geometries coordinate-wise to compute displacement vectors.
1314///
1315/// This function performs element-wise subtraction of coordinate arrays to
1316/// compute the displacement vector between two molecular geometries. The
1317/// result represents the direction and magnitude of atomic movements.
1318///
1319/// # Mathematical Operation
1320///
1321/// For geometries with coordinates [x₁, y₁, z₁, x₂, y₂, z₂, ...]:
1322/// ```text
1323/// displacement = geom1 - geom2 = [x₁¹-x₁², y₁¹-y₁², z₁¹-z₁², ...]
1324/// ```
1325///
1326/// # Arguments
1327///
1328/// * `geom1` - First geometry (minuend)
1329/// * `geom2` - Second geometry (subtrahend)
1330///
1331/// # Returns
1332///
1333/// Vector of coordinate differences with length 3×N_atoms, where each group
1334/// of 3 elements represents [Δx, Δy, Δz] for one atom.
1335///
1336/// # Applications
1337///
1338/// - **Tangent vectors**: Compute path directions in NEB
1339/// - **Displacement analysis**: Understand atomic movements
1340/// - **Force calculations**: Determine direction of coordinate changes
1341/// - **Path characterization**: Analyze reaction coordinate changes
1342///
1343/// # Vector Interpretation
1344///
1345/// The resulting vector points from geom2 toward geom1:
1346/// - **Positive components**: Atom moved in positive coordinate direction
1347/// - **Negative components**: Atom moved in negative coordinate direction
1348/// - **Zero components**: No movement in that coordinate direction
1349fn subtract_geometries(geom1: &Geometry, geom2: &Geometry) -> Vec<f64> {
1350    geom1
1351        .coords
1352        .iter()
1353        .zip(geom2.coords.iter())
1354        .map(|(a, b)| a - b)
1355        .collect()
1356}
1357
1358/// Calculates the average spacing between consecutive images in a reaction path.
1359///
1360/// This function computes the mean distance between neighboring geometries
1361/// along a reaction path. It's used in NEB to determine the reference spacing
1362/// for spring force calculations and to assess path quality.
1363///
1364/// # Mathematical Definition
1365///
1366/// Average spacing is calculated as:
1367/// ```text
1368/// d_avg = (1/(n-1)) × Σᵢ₌₁ⁿ⁻¹ RMSD(Rᵢ₊₁, Rᵢ)
1369/// ```
1370///
1371/// where:
1372/// - n is the number of images in the path
1373/// - RMSD(Rᵢ₊₁, Rᵢ) is the distance between consecutive images
1374/// - The sum is over all adjacent image pairs
1375///
1376/// # Arguments
1377///
1378/// * `path` - Array of geometries representing the reaction path
1379///
1380/// # Returns
1381///
1382/// Average spacing in the same units as coordinates (typically Angstroms).
1383/// Returns 0.0 for paths with fewer than 2 images.
1384///
1385/// # Applications in NEB
1386///
1387/// ## Spring Force Reference
1388/// The average spacing serves as the equilibrium length for NEB springs:
1389/// ```text
1390/// F_spring ∝ (d_actual - d_avg)
1391/// ```
1392///
1393/// ## Path Quality Assessment
1394/// - **Uniform spacing**: d_avg ≈ individual spacings (good path)
1395/// - **Non-uniform spacing**: Large variation in spacings (needs optimization)
1396///
1397/// ## Convergence Monitoring
1398/// - **Stable d_avg**: Path spacing has converged
1399/// - **Changing d_avg**: Path still optimizing
1400///
1401/// # Ideal Path Properties
1402///
1403/// - **Even spacing**: All consecutive distances ≈ d_avg
1404/// - **Smooth progression**: No large jumps between adjacent images
1405/// - **Appropriate resolution**: d_avg small enough to capture path features
1406fn get_average_spacing(path: &[Geometry]) -> f64 {
1407    if path.len() < 2 {
1408        return 0.0;
1409    }
1410
1411    let mut total_distance = 0.0;
1412    for i in 1..path.len() {
1413        total_distance += calculate_rmsd(&path[i], &path[i - 1]);
1414    }
1415
1416    total_distance / (path.len() - 1) as f64
1417}
1418
1419/// Calculates the Euclidean distance between two atoms in a molecular geometry.
1420///
1421/// This function computes the bond length between two specified atoms using
1422/// the standard Euclidean distance formula in 3D Cartesian coordinates.
1423/// It's a fundamental operation in molecular geometry analysis.
1424///
1425/// # Mathematical Formula
1426///
1427/// The distance between atoms i and j is:
1428/// ```text
1429/// d_ij = √[(x_j - x_i)² + (y_j - y_i)² + (z_j - z_i)²]
1430/// ```
1431///
1432/// where (x_i, y_i, z_i) and (x_j, y_j, z_j) are the Cartesian coordinates
1433/// of atoms i and j, respectively.
1434///
1435/// # Arguments
1436///
1437/// * `geometry` - The molecular geometry containing atomic coordinates
1438/// * `atom1` - Index of the first atom (0-based)
1439/// * `atom2` - Index of the second atom (0-based)
1440///
1441/// # Returns
1442///
1443/// Distance between the two atoms in the same units as the input coordinates
1444/// (typically Angstroms).
1445///
1446/// # Applications
1447///
1448/// - **Bond length analysis**: Measure actual bond distances
1449/// - **Constraint evaluation**: Check if bond constraints are satisfied
1450/// - **Geometry validation**: Ensure reasonable molecular structure
1451/// - **Force calculations**: Compute bond-based constraint forces
1452///
1453/// # Typical Bond Lengths (for reference)
1454///
1455/// - **C-H**: ~1.1 Angstrom
1456/// - **C-C**: ~1.5 Angstrom (single), ~1.3 Angstrom (double), ~1.2 Angstrom (triple)
1457/// - **C-O**: ~1.4 Angstrom (single), ~1.2 Angstrom (double)
1458/// - **O-H**: ~1.0 Angstrom
1459/// - **N-H**: ~1.0 Angstrom
1460///
1461/// # Examples
1462///
1463/// ```rust
1464/// // Calculate C-H bond length
1465/// let ch_distance = calculate_distance(&geometry, 0, 1);
1466/// println!("C-H bond length: {:.3} Angstrom", ch_distance);
1467///
1468/// // Check if bond is within reasonable range
1469/// if ch_distance > 2.0 {
1470///     println!("Warning: Unusually long C-H bond!");
1471/// }
1472/// ```
1473fn calculate_distance(geometry: &Geometry, atom1: usize, atom2: usize) -> f64 {
1474    let pos1 = geometry.get_atom_coords(atom1);
1475    let pos2 = geometry.get_atom_coords(atom2);
1476    let dx = pos1[0] - pos2[0];
1477    let dy = pos1[1] - pos2[1];
1478    let dz = pos1[2] - pos2[2];
1479    (dx * dx + dy * dy + dz * dz).sqrt()
1480}
1481
1482/// Adjusts the bond length between two atoms to a target value.
1483///
1484/// This function modifies the molecular geometry by moving one atom to achieve
1485/// a specific bond length. It's a simplified constraint satisfaction method
1486/// used in path optimization when geometric constraints must be maintained.
1487///
1488/// # Algorithm Description
1489///
1490/// The method uses a scaling approach:
1491/// 1. **Current distance**: Calculate existing bond length
1492/// 2. **Scaling factor**: Compute ratio of target to current length
1493/// 3. **Position adjustment**: Move atom2 along the bond vector
1494/// 4. **Coordinate update**: Apply the new position to the geometry
1495///
1496/// # Mathematical Implementation
1497///
1498/// The new position of atom2 is calculated as:
1499/// ```text
1500/// r₂_new = r₁ + (target_length / current_length) × (r₂ - r₁)
1501/// ```
1502///
1503/// where:
1504/// - r₁, r₂ are the current positions of atoms 1 and 2
1505/// - target_length is the desired bond length
1506/// - current_length is the existing bond length
1507///
1508/// # Arguments
1509///
1510/// * `geometry` - Mutable reference to the molecular geometry
1511/// * `atom1` - Index of the first atom (remains fixed)
1512/// * `atom2` - Index of the second atom (will be moved)
1513/// * `target_length` - Desired bond length in same units as coordinates
1514///
1515/// # Behavior Details
1516///
1517/// ## Fixed Atom Choice
1518/// - **Atom1**: Remains at its original position (anchor point)
1519/// - **Atom2**: Moved along the bond vector to achieve target distance
1520///
1521/// ## Direction Preservation
1522/// - The bond direction (unit vector) remains unchanged
1523/// - Only the magnitude (bond length) is modified
1524/// - No rotation or angular changes occur
1525///
1526/// ## Edge Cases
1527/// - **Zero distance**: Function returns early to avoid division by zero
1528/// - **Negative target**: Would invert bond direction (not physically meaningful)
1529///
1530/// # Applications
1531///
1532/// - **Constraint enforcement**: Maintain specific bond lengths during optimization
1533/// - **Geometry correction**: Fix unreasonable bond distances
1534/// - **Path optimization**: Apply bond constraints in NEB calculations
1535/// - **Structure preparation**: Set up initial geometries with desired bond lengths
1536///
1537/// # Limitations
1538///
1539/// ## Simplified Approach
1540/// - **Single bond focus**: Only considers the specified bond
1541/// - **No connectivity**: Ignores effects on other bonds and angles
1542/// - **No energy consideration**: Purely geometric adjustment
1543/// - **Fixed anchor**: Always keeps atom1 stationary
1544///
1545/// ## Potential Issues
1546/// - **Steric clashes**: May create unrealistic atomic overlaps
1547/// - **Angle distortion**: Can distort bond angles involving these atoms
1548/// - **Chain effects**: Changes may propagate through molecular structure
1549///
1550/// # Future Improvements
1551///
1552/// - **Mass-weighted adjustment**: Move both atoms based on their masses
1553/// - **Connectivity awareness**: Consider effects on neighboring bonds
1554/// - **Energy minimization**: Combine with local energy optimization
1555/// - **Multiple constraint handling**: Simultaneously satisfy several constraints
1556fn adjust_bond_length(geometry: &mut Geometry, atom1: usize, atom2: usize, target_length: f64) {
1557    let current_length = calculate_distance(geometry, atom1, atom2);
1558    if current_length == 0.0 {
1559        return;
1560    }
1561
1562    let scale = target_length / current_length;
1563    let pos1 = geometry.get_atom_coords(atom1);
1564    let pos2 = geometry.get_atom_coords(atom2);
1565
1566    // Move atom2 towards/away from atom1
1567    let new_pos2 = [
1568        pos1[0] + (pos2[0] - pos1[0]) * scale,
1569        pos1[1] + (pos2[1] - pos1[1]) * scale,
1570        pos1[2] + (pos2[2] - pos1[2]) * scale,
1571    ];
1572
1573    // Update coordinates
1574    let base_idx = atom2 * 3;
1575    geometry.coords[base_idx] = new_pos2[0];
1576    geometry.coords[base_idx + 1] = new_pos2[1];
1577    geometry.coords[base_idx + 2] = new_pos2[2];
1578}
1579
1580/// Analyzes a reaction path and computes comprehensive statistical information.
1581///
1582/// This function performs detailed analysis of a reaction path, calculating
1583/// various metrics that characterize the path quality, length, and properties.
1584/// It's essential for understanding reaction mechanisms and validating path
1585/// optimization results.
1586///
1587/// # Analysis Components
1588///
1589/// ## Path Length Calculation
1590/// Computes the total arc length along the reaction path:
1591/// ```text
1592/// L_total = Σᵢ₌₁ⁿ⁻¹ RMSD(Rᵢ₊₁, Rᵢ)
1593/// ```
1594///
1595/// This represents the total "distance" traveled in configuration space
1596/// from reactant to product.
1597///
1598/// ## Geometric Metrics
1599/// - **Number of images**: Total path resolution
1600/// - **Average spacing**: Mean distance between consecutive images
1601/// - **Path smoothness**: Variation in inter-image distances
1602///
1603/// ## Energy Analysis (Future Extension)
1604/// The framework supports energy analysis:
1605/// - **Barrier heights**: Maximum energy along path
1606/// - **Reaction energy**: Energy difference between endpoints
1607/// - **Energy profile**: Complete energy vs. reaction coordinate
1608///
1609/// # Arguments
1610///
1611/// * `geometries` - Array of molecular geometries representing the reaction path
1612///
1613/// # Returns
1614///
1615/// Returns a `PathStatistics` struct containing:
1616/// - `path_length`: Total path length in coordinate space
1617/// - `num_points`: Number of geometries in the path
1618/// - `energies`: Energy values (currently empty, for future use)
1619/// - `coordinates`: Reaction coordinate values (currently empty, for future use)
1620///
1621/// # Path Length Interpretation
1622///
1623/// ## Physical Meaning
1624/// The path length represents the "effort" required to transform the reactant
1625/// into the product through the specific pathway:
1626/// - **Short paths**: Direct, efficient transformations
1627/// - **Long paths**: Complex, multi-step mechanisms
1628/// - **Very long paths**: Potentially unphysical or poorly optimized
1629///
1630/// ## Typical Values
1631/// - **Simple reactions**: 1-5 Angstrom total path length
1632/// - **Complex rearrangements**: 5-20 Angstrom total path length
1633/// - **Conformational changes**: 2-10 Angstrom total path length
1634///
1635/// # Applications
1636///
1637/// ## Path Quality Assessment
1638/// - **Resolution check**: Ensure adequate number of images
1639/// - **Smoothness evaluation**: Identify poorly optimized regions
1640/// - **Convergence monitoring**: Track path changes during optimization
1641///
1642/// ## Mechanism Analysis
1643/// - **Pathway comparison**: Compare different reaction routes
1644/// - **Bottleneck identification**: Find regions requiring more images
1645/// - **Coordinate selection**: Choose appropriate reaction coordinates
1646///
1647/// ## Method Validation
1648/// - **Algorithm comparison**: Evaluate different path optimization methods
1649/// - **Parameter tuning**: Optimize NEB spring constants and convergence criteria
1650/// - **Benchmark studies**: Compare with experimental or high-level theoretical results
1651///
1652/// # Examples
1653///
1654/// ```rust
1655/// use omecp::reaction_path::analyze_reaction_path;
1656///
1657/// // Analyze an optimized reaction path
1658/// let stats = analyze_reaction_path(&optimized_path);
1659///
1660/// println!("Path analysis:");
1661/// println!("  Total length: {:.3} Angstrom", stats.path_length);
1662/// println!("  Number of images: {}", stats.num_points);
1663/// println!("  Average spacing: {:.3} Angstrom",
1664///          stats.path_length / (stats.num_points - 1) as f64);
1665/// ```
1666///
1667/// # Future Extensions
1668///
1669/// ## Energy Integration
1670/// When energy data becomes available:
1671/// ```rust
1672/// // Future capability
1673/// if !stats.energies.is_empty() {
1674///     let barrier_height = stats.energies.iter().fold(0.0, |a, &b| a.max(b));
1675///     println!("Activation barrier: {:.2} kcal/mol", barrier_height * 627.5);
1676/// }
1677/// ```
1678///
1679/// ## Advanced Metrics
1680/// - **Path curvature**: Measure of path deviation from straight line
1681/// - **Intrinsic reaction coordinate**: Arc length parameterization
1682/// - **Turning points**: Identify regions of high curvature
1683/// - **Bottleneck analysis**: Locate transition state regions
1684pub fn analyze_reaction_path(geometries: &[Geometry]) -> PathStatistics {
1685    let energies = Vec::new(); // Would be filled from QM calculations
1686    let coordinates = Vec::new();
1687
1688    // Calculate path length
1689    let mut path_length = 0.0;
1690    for i in 1..geometries.len() {
1691        let coords1 = geometry_to_coords(&geometries[i - 1]);
1692        let coords2 = geometry_to_coords(&geometries[i]);
1693
1694        let mut segment_length = 0.0;
1695        for j in 0..coords1.len() {
1696            let diff = coords1[j] - coords2[j];
1697            segment_length += diff * diff;
1698        }
1699        path_length += segment_length.sqrt();
1700    }
1701
1702    PathStatistics {
1703        path_length,
1704        num_points: geometries.len(),
1705        energies,
1706        coordinates,
1707    }
1708}
1709
1710/// Statistics for a reaction path
1711/// Statistics and data collected along a reaction path.
1712///
1713/// This struct holds various metrics and data points generated during a
1714/// reaction path optimization or coordinate driving procedure, such as
1715/// path length, number of points, and (potentially) energies and other
1716/// relevant coordinates.
1717#[derive(Debug, Clone)]
1718pub struct PathStatistics {
1719    /// The total length of the reaction path.
1720    pub path_length: f64,
1721    /// The number of discrete points (geometries) along the path.
1722    pub num_points: usize,
1723    /// A vector of energies corresponding to each point on the path.
1724    pub energies: Vec<f64>,
1725    /// A vector of reaction coordinate values corresponding to each point on the path.
1726    pub coordinates: Vec<f64>,
1727}
1728
1729/// Converts a molecular geometry to a flat coordinate vector.
1730///
1731/// This utility function extracts all atomic coordinates from a geometry object
1732/// and arranges them in a single flat vector. This representation is convenient
1733/// for mathematical operations, distance calculations, and vector arithmetic
1734/// used throughout reaction path methods.
1735///
1736/// # Data Layout
1737///
1738/// The output vector contains coordinates in the following order:
1739/// ```text
1740/// [x₁, y₁, z₁, x₂, y₂, z₂, ..., xₙ, yₙ, zₙ]
1741/// ```
1742///
1743/// where (xᵢ, yᵢ, zᵢ) are the Cartesian coordinates of atom i.
1744///
1745/// # Arguments
1746///
1747/// * `geom` - The molecular geometry to convert
1748///
1749/// # Returns
1750///
1751/// A `Vec<f64>` with length 3×N_atoms containing all coordinate components
1752/// in sequential order.
1753///
1754/// # Applications
1755///
1756/// ## Mathematical Operations
1757/// - **Vector arithmetic**: Addition, subtraction, scaling of entire geometries
1758/// - **Distance calculations**: RMSD and path length computations
1759/// - **Linear algebra**: Matrix operations on coordinate sets
1760///
1761/// ## Path Analysis
1762/// - **Interpolation**: Generate intermediate geometries along paths
1763/// - **Tangent vectors**: Compute path directions for NEB
1764/// - **Displacement analysis**: Study atomic movements during reactions
1765///
1766/// ## Optimization Algorithms
1767/// - **Gradient calculations**: Flat vectors for optimization routines
1768/// - **Constraint handling**: Vector operations for constraint satisfaction
1769/// - **Force applications**: Apply forces to all atoms simultaneously
1770///
1771/// # Memory Layout
1772///
1773/// The flat representation is memory-efficient and cache-friendly:
1774/// - **Contiguous storage**: All coordinates in a single array
1775/// - **Predictable access**: Sequential memory access patterns
1776/// - **SIMD compatibility**: Suitable for vectorized operations
1777///
1778/// # Examples
1779///
1780/// ```rust
1781/// // Convert geometry for mathematical operations
1782/// let coords = geometry_to_coords(&geometry);
1783/// println!("Total coordinates: {}", coords.len());
1784/// println!("Number of atoms: {}", coords.len() / 3);
1785///
1786/// // Access specific atom coordinates
1787/// let atom_index = 2;  // Third atom (0-based)
1788/// let x = coords[atom_index * 3];
1789/// let y = coords[atom_index * 3 + 1];
1790/// let z = coords[atom_index * 3 + 2];
1791/// println!("Atom {} position: ({:.3}, {:.3}, {:.3})", atom_index, x, y, z);
1792/// ```
1793///
1794/// # Coordinate System
1795///
1796/// - **Units**: Same as input geometry (typically Angstroms)
1797/// - **Origin**: Arbitrary (depends on input geometry)
1798/// - **Axes**: Standard Cartesian (x, y, z)
1799/// - **Handedness**: Right-handed coordinate system assumed
1800fn geometry_to_coords(geom: &Geometry) -> Vec<f64> {
1801    let mut coords = Vec::new();
1802    for i in 0..geom.num_atoms {
1803        let atom_coords = geom.get_atom_coords(i);
1804        coords.push(atom_coords[0]);
1805        coords.push(atom_coords[1]);
1806        coords.push(atom_coords[2]);
1807    }
1808    coords
1809}
omecp/reaction_path.rs

omecp/
reaction_path.rs