This change introduces a different structure for `Std.StatePreparation.PreparePureStateD` and its underlying helper functions. It reduces compile time by offloading more of the classical computation into static functions that run on the full evaluator, rather than being handled inside the partial-evaluation layer.
Change in memory usage detected by benchmark. Memory Report for a5fb96c
fedimser left a comment
I see 2 improvements: 1) Moving recursion from quantum operation to classical function. 2) Using doubles instead of complex.
My guess is that most performance gain comes from (1).
Did you evaluate how much gain comes from (2) and whether it is significant? If it is not significant, it might be better to leave it out.
Alternatively, I'd split it into 2 PRs, one per optimization.
    let abs0 = AbsD(coefficients[idxCoeff]);
    let abs1 = AbsD(coefficients[idxCoeff + 1]);
    let arg0 = coefficients[idxCoeff] < 0.0 ? PI() | 0.0;
    let arg1 = coefficients[idxCoeff + 1] < 0.0 ? PI() | 0.0;
    let r = Sqrt(abs0 * abs0 + abs1 * abs1);
    let t = 0.5 * (arg0 + arg1);
    let phi = arg1 - arg0;
    let theta = 2.0 * ArcTan2(abs1, abs0);
    (ComplexPolar(r, t), phi, theta)
I think this can be factored into a function BlochSphereCoordinatesD taking 2 Doubles.
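A rough sketch of what such a helper would compute, written in Python so the arithmetic can be checked outside the Q# toolchain. The name `bloch_sphere_coordinates_d` and the return shape `((r, t), phi, theta)` are assumptions that mirror the snippet above; this is not the library's actual function.

```python
import math
from typing import Tuple


def bloch_sphere_coordinates_d(a0: float, a1: float) -> Tuple[Tuple[float, float], float, float]:
    """Hypothetical helper: mirrors the inline computation for two real amplitudes.

    Returns ((r, t), phi, theta), where (r, t) stands in for ComplexPolar(r, t).
    """
    abs0 = abs(a0)
    abs1 = abs(a1)
    # A real number has argument 0 if non-negative and pi if negative.
    arg0 = math.pi if a0 < 0.0 else 0.0
    arg1 = math.pi if a1 < 0.0 else 0.0
    r = math.sqrt(abs0 * abs0 + abs1 * abs1)
    t = 0.5 * (arg0 + arg1)
    phi = arg1 - arg0
    theta = 2.0 * math.atan2(abs1, abs0)
    return (r, t), phi, theta
```

Because both inputs are real, `phi` can only be `0`, `pi`, or `-pi`, which is what makes the complex-free path possible.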
    let coefficientsAsComplexPolar = Mapped(a -> ComplexAsComplexPolar(Complex(a, 0.0)), coefficients);
    ApproximatelyPreparePureStateCP(0.0, coefficientsAsComplexPolar, qubits);
    let nQubits = Length(qubits);
    // pad coefficients at tail length to a power of 2.
Do we have a Q# style guide, and what does it say about comments?
Generally, when I write comments, I write them as sentences, starting with a capital letter and ending with a period. I'd suggest following this pattern.
    // Since we know the coefficients are real, we can optimize the first round of adjoint approximate unpreparation by directly
    // computing the disentangling angles and the new coefficients on those doubles without producing intermediate complex numbers.

    // For each 2D block, compute disentangling single-qubit rotation parameters
What does "2D" stand for? My first thought without context is "2-dimensional", but it's probably something else.
        }
    }

    // Provides the sequence of angles or entangling CNOTs to apply for the multiplex Z step of the state preparation procedure, given a set of coefficients and control and target qubits.
I think it's worth calling out that there is a convention for interpreting the output:
- Every element has 1 or 2 qubits.
- If it's one qubit, it's a rotation.
- If it's 2 qubits, it's CNOT, and angle is ignored.
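The convention above can be sketched as a small decoder. This is an illustrative Python model only: qubits are represented as plain integers, and the name `decode_params` and the `"ExpZ"`/`"CNOT"` tags are hypothetical labels, not part of the Q# library.

```python
from typing import List, Sequence, Tuple

# One entry of the generated sequence: (angle, qubits), with qubits modeled as ints.
Param = Tuple[float, Sequence[int]]


def decode_params(params: Sequence[Param]) -> List[tuple]:
    """Interpret each (angle, qubits) pair according to the stated convention."""
    gates = []
    for angle, qubits in params:
        if len(qubits) == 2:
            # Two qubits: a CNOT from qubits[0] to qubits[1]; the angle is ignored.
            gates.append(("CNOT", qubits[0], qubits[1]))
        elif len(qubits) == 1:
            # One qubit: a Z rotation by the given angle.
            gates.append(("ExpZ", angle, qubits[0]))
        else:
            raise ValueError("each entry must carry exactly 1 or 2 qubits")
    return gates
```

Encoding two gate kinds in one tuple shape keeps the return type a flat array, at the cost of a placeholder angle on CNOT entries, which is why the convention deserves a comment in the code.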
    // Provides the sequence of angles or entangling CNOTs to apply for the multiplex Z step of the state preparation procedure, given a set of coefficients and control and target qubits.
    function GenerateMultiplexZParams(
        tolerance : Double,
Tolerance is not really used.
I'd suggest either not passing it, or passing it and using it in the "termination case" so that the rotation gate is not emitted.
    controlled adjoint (controlRegister, ...) {
        // pad coefficients length to a power of 2.
        let coefficientsPadded = Padded(2^(Length(control) + 1), 0.0, Padded(-2^Length(control), 0.0, coefficients));
        let (coefficients0, coefficients1) = MultiplexZCoefficients(coefficientsPadded);
        if AnyOutsideToleranceD(tolerance, coefficients1) {
            within {
                Controlled X(controlRegister, target);
            } apply {
                Adjoint ApproximatelyMultiplexZ(tolerance, coefficients1, control, target);
            }
        }
        Adjoint ApproximatelyMultiplexZ(tolerance, coefficients0, control, target);
    }
I am not sure why we need an explicit controlled adjoint specialization and why it can't be auto-generated.
    body ... {
        // We separately compute the operation sequence for the multiplex Z steps in a function, which
        // provides a performance improvement during partial-evaluation for code generation.
        let multiplexZParams = GenerateMultiplexZParams(tolerance, coefficients, control, target);
        for (angle, qs) in multiplexZParams {
            if Length(qs) == 2 {
                CNOT(qs[0], qs[1]);
            } elif AbsD(angle) > tolerance {
                Exp([PauliZ], angle, qs);
            }
        } else {
            // Compute new coefficients.
            let (coefficients0, coefficients1) = MultiplexZCoefficients(coefficientsPadded);
            ApproximatelyMultiplexZ(tolerance, coefficients0, Most(control), target);
            if AnyOutsideToleranceD(tolerance, coefficients1) {
                within {
                    CNOT(Tail(control), target);
                } apply {
                    ApproximatelyMultiplexZ(tolerance, coefficients1, Most(control), target);
                }
            }
        }

    adjoint ... {
        // We separately compute the operation sequence for the adjoint multiplex Z steps in a function, which
        // provides a performance improvement during partial-evaluation for code generation.
        let adjMultiplexZParams = GenerateAdjMultiplexZParams(tolerance, coefficients, control, target);
        for (angle, qs) in adjMultiplexZParams {
            if Length(qs) == 2 {
                CNOT(qs[0], qs[1]);
            } elif AbsD(angle) > tolerance {
                Exp([PauliZ], -angle, qs);
            }
        }
    }
I think we can move the tolerance check and the angle negation for the adjoint case into the classical functions.
Then the quantum part is exactly the same in both cases and can be factored into a separate operation to reduce code duplication:
    operation ApproximatelyMultiplexZ(tolerance, coefficients, control, target) {
        body ... {
            ApplyRzCnots(GenerateMultiplexZParams(tolerance, coefficients, control, target));
        }
        adjoint ... {
            ApplyRzCnots(GenerateAdjMultiplexZParams(tolerance, coefficients, control, target));
        }
    }
and then:
    operation ApplyRzCnots(params : (Double, Qubit[])[]) : Unit {
        for (angle, qs) in params {
            if Length(qs) == 2 {
                CNOT(qs[0], qs[1]);
            } else {
                Exp([PauliZ], angle, qs);
            }
        }
    }
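As a rough illustration of the classical side of this suggestion, the following Python sketch filters out sub-tolerance rotations and optionally negates angles before anything reaches the quantum loop. The function name `finalize_params` and the integer qubit labels are assumptions for illustration. It only post-processes an already generated sequence and does not reorder it, so any ordering difference between the forward and adjoint sequences would still have to come from the generators themselves.

```python
from typing import List, Sequence, Tuple

Param = Tuple[float, Sequence[int]]


def finalize_params(params: Sequence[Param], tolerance: float, negate: bool) -> List[Tuple[float, List[int]]]:
    """Hypothetical classical post-processing step.

    Drops rotations whose magnitude is within tolerance and negates the
    remaining angles when `negate` is True. CNOT entries (two qubits) are
    always kept, and their ignored angle is normalized to 0.0.
    """
    out: List[Tuple[float, List[int]]] = []
    for angle, qubits in params:
        if len(qubits) == 2:
            out.append((0.0, list(qubits)))
        elif abs(angle) > tolerance:
            out.append((-angle if negate else angle, list(qubits)))
    return out
```

With this done classically, the quantum loop needs neither the `AbsD(angle) > tolerance` branch nor a separate adjoint body that flips the sign.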
    // We separately compute the operation sequence for the multiplex Z steps in a function, which
    // provides a performance improvement during partial-evaluation for code generation.
I think this is too much information for the reader, especially if they don't know what partial evaluation is, which is an implementation detail of the compiler.
I'd just call out that recursion in a quantum operation is more expensive than in a classical function.
Or just don't have any comment here; it's obvious that we separate quantum and classical computation.
In local testing, this change improved QIR compilation of a large state preparation on 16 qubits from 14 seconds to 4 seconds.