Next: The circuits for integer
Up: Analysis of Bernstein's Factorization
Previous: Introduction
Subsections
Background on the number field sieve
In theory and in practice the two main steps of the NFS
are the relation collection step
and the matrix step. We review their heuristic asymptotic runtime
analysis because it enables us to stress several points
that are important for a proper understanding of ``standard-NFS'' and
of ``circuit-NFS'' as proposed in [1].
2.1. Smoothness.
An integer is
called
-smooth if all its prime factors are at most
.
Following [10, 3.16]
we denote by
any function of
that equals
for
where
and
are real numbers with
and
logarithms are natural. Thus,
,
,
if
,
and if
then
for any fixed
, and
where
is the number of primes
.
Let
,
,
, and
be fixed real numbers with
. A random
positive integer
is
-smooth with
probability
for
We abbreviate
to
and
to
. Thus,
a random integer
is
-smooth with
probability
. The notation
in [1] corresponds to
here.
We write ``
'' for ``
for
.''
2.2. Ordinary NFS.
To factor
using the NFS, more or less following the approach from [11], one
selects a positive integer
for a positive value
that is yet to be determined, an integer
close to
, a polynomial
such that
with each
of the same order of
magnitude
as
, a rational smoothness bound
, and an algebraic smoothness
bound
. Other properties of
these parameters are not relevant for our purposes.
A pair
of integers is called a relation if
and
are
coprime,
,
is
-smooth, and
is
-smooth.
Each relation corresponds to a sparse
-dimensional bit vector with
prime
(cf. [11]).
In the relation collection step a set of more than
relations
is sought. Given this set, one or more linear dependencies modulo
among
the corresponding
-dimensional bit vectors are constructed
in the matrix step. Per dependency there is a chance of at least 50%
(exactly 50% for RSA moduli) that
a factor of
is found in the final step, the square root step. We
discuss some issues of the relation collection and
matrix steps that are relevant for [1].
2.3. Relation collection.
We restrict the search for relations to the rectangle
,
and use
and
that are both
(which
does not imply that
), for
that
are yet to be determined.
It follows (cf. 2.1) that
. Furthermore,
and
With 2.1 and under
the usual assumption that
and
behave, with respect to
smoothness probabilities, independently as random integers of comparable
sizes, the probability that both are
-smooth is
The search space contains
pairs
and, due to the
, as many pairs
with
.
It follows that
and
must be chosen such that
We find that
2.4. Testing for smoothness.
The
search space can be processed in
operations. If
sufficient memory is available this can be done using sieving.
Current PC implementations intended for the factorization of relatively
small
numbers usually have adequate memory for sieving. For much larger
numbers and current programs, sieving would become problematic. In that
case,
the search space can be processed in the ``same''
operations
(with an, admittedly, larger
) but at a cost of only
memory
using the Elliptic Curve Method (ECM) embellished in any way one sees fit
with trial division, Pollard rho, early aborts, etc., and run on any
number
of processors in parallel to achieve a
-fold speedup. This was
observed many times (see for instance [10, 4.15] and [4]).
Thus, despite the fact that current implementations of the relation collection
require substantial memory, it is well known that asymptotically this step
requires negligible memory without incurring, in theory, a runtime penalty -
in practice, however, it is substantially slower than sieving.
Intermediate solutions that exchange sieving
memory for many tightly coupled processors with small memories could prove
valuable too; see [6] for an early example of this approach
and [1] for various other interesting proposals that may turn out
to be practically relevant. For the asymptotic argument, ECM suffices.
In improved NFS from [4] it was necessary to use a
``memory-free'' method when searching for
-smooth
numbers (cf. 2.2), in order to achieve the speedup.
It was suggested in [4] that the ECM may be used for this purpose.
Since memory usage was no concern for the
analysis in [4], regular ``memory-wasteful'' sieving was
suggested to test
-smoothness.
2.5. The matrix step.
The choices made in 2.3 result in a bit matrix
consisting of
columns such that each column of
contains only
nonzero entries. Denoting by
the
total number of nonzero entries of
(its weight), it
follows that
.
Using a variety of
techniques [5,13], dependencies
can be found after, essentially,
multiplications of
times a
bit vector. Since one matrix-by-vector multiplication can be done in
operations, the matrix step can be completed in
operations. We use ``standard-NFS'' to refer to NFS
that uses a matrix step with
operation count.
We will be concerned with a specific method for finding the
dependencies in
, namely the block Wiedemann algorithm
[5][18] whose outline is as follows. Let
be
the blocking factor, i.e., the amount of parallelism desired. We may
assume that either
or
.
Choose
binary
-dimensional vectors
for
. For
each
, compute the vectors
for
up to roughly
,
using repeated matrix-by-vector multiplication. For each such vector
, compute the inner products
, for all
. Only these inner products are saved, to conserve storage. From
the inner products, compute certain polynomials
,
of degree about
. Then evaluate
,
for all
and
(take one
at a time and evaluate
for all
simultaneously using repeated
matrix-by-vector multiplications).
From the result,
elements from the kernel of
can be
computed. The procedure is probabilistic, but succeeds with high
probability for
[17]. For
, the cost roughly
doubles [18].
For reasonable blocking factors (
or
), the block
Wiedemann algorithm involves about
matrix-by-vector
multiplications. These multiplications dominate the cost of the matrix
step; accordingly, the circuits of [1], and our variants thereof,
aim to reduce their cost. Note that the multiplications are performed
in
separate chains where each chain involves repeated
left-multiplication by
. The proposed circuits rely on this for
their efficiency. Thus, they appear less suitable for other
dependency-finding algorithms, such as block Lanczos [13]
which requires just
multiplications.
2.6. NFS parameter optimization for matrix
exponent
.
With the relation collection and matrix steps in
and
operations, respectively, the values for
,
, and
that
minimize the overall NFS operation count follow using
Relation (1).
However, we also need the optimal values if the ``cost'' of the matrix
step is different from
: in [1] ``cost'' is defined
using a metric that is not always the same as operation count, so we need
to analyse the NFS using alternative cost metrics. This can be done by
allowing flexibility in the ``cost'' of the matrix step: we consider how to
optimize the NFS parameters for an
matrix step, for some exponent
. The corresponding relation collection operation count
is fixed at
(cf. 2.4).
We balance the cost of the relation collection and matrix steps by taking
. With (1) it follows that

Minimizing
given
leads to
and
Minimizing the resulting
leads to
and
: even though
would allow more ``relaxed'' relations (i.e., larger smoothness bounds and
thus easier to find), the fact that more of such relations have to be found
becomes counterproductive. It follows that an operation count of
is optimal for relation collection, but that for
it is better to use suboptimal relation collection because otherwise the
matrix step would dominate. We find the following optimal NFS parameters:
-
:
,
, and
, with operation counts of relation collection and
matrix steps equal to
and
, respectively. For
the operation counts of the two steps are the same (when expressed in
)
and the overall operation count is
.
This corresponds to the heuristic asymptotic runtime of the NFS as given
in [11]. We refer to these parameter choices as the ordinary
parameter choices.
-
:
,
, and
as given by Relations (2),
(4), and (3), respectively, with operation count
for relation collection and cost
for the matrix step, where
. More in
particular, we find the following values.
-
:
,
, and
, for an operation count and cost
for the relation collection and matrix
steps, respectively. These values are
familiar from [1, Section 6: Circuits].
With
and equating
operation count and cost, this suggests that factoring
-bit composites using NFS with matrix
exponent
is comparable to factoring 512-bit ones using standard-NFS
with ordinary parameter choices (disregarding the effects of the
's).
-
:
,
, and
, for
an operation count and cost of
for the
relation collection and matrix steps, respectively.
2.7. Improved NFS.
It was shown
in [4] that ordinary NFS from [11], and as used
in 2.2, can be improved by using more than a single polynomial
.
Let
and
be as in 2.3 and 2.2,
respectively, let
indicate the rational smoothness bound
(i.e.,
), and let
indicate the algebraic smoothness
bound
(i.e.,
). Let
be a set of
different polynomials, each of degree
and
common root
modulo
(as in 2.2). A pair
of
integers is a relation if
and
are coprime,
,
is
-smooth, and
is
-smooth for at least one
.
Let
be the matrix exponent. Balancing the cost of the relation
collection and matrix steps it follows that
.
Optimization leads to

and for this
to

and

It follows that for
the method from [4]
gives an improvement over the ordinary method, namely
. The condition
leads to
, so that for
(as in circuit-NFS,
cf. 3.1) usage of the method from [4] no
longer leads to an improvement over the
ordinary method. This explains why in [1] the method from [4]
is used to select parameters for standard-NFS and why the ordinary method
is used for circuit-NFS.
With 2.1 it follows that the sum of the (rational) sieving
and ECM-based (algebraic) smoothness times from [4] (cf. last
paragraph of 2.4) is minimized if
.
The above formulas then lead to
. Therefore, unlike the ordinary
parameter selection method, optimal relation collection for the improved
method from [4] occurs for an
with
: with
the operation count for relation collection becomes
. Thus, in principle, and
depending on the cost function one is using, the improved method would
be able to take advantage of a matrix step with exponent
.
If we disregard the matrix step and minimize the
operation count of relation collection, this method yields a cost of
.
Next: The circuits for integer
Up: Analysis of Bernstein's Factorization
Previous: Introduction