roc(3)

NAME

Statistics::ROC - receiver-operator-characteristic (ROC)
curves with nonparametric confidence bounds

SYNOPSIS

use Statistics::ROC;
my ($y)    = loggamma($x);
my ($y)    = betain($x, $p, $q, $beta);
my ($y)    = Betain($x, $p, $q);
my ($y)    = xinbta($p, $q, $beta, $alpha);
my ($y)    = Xinbta($p, $q, $alpha);
my (@rk)   = rank($type, @r);
my (@ROC)  = roc($model_type,$conf,@val_grp);

DESCRIPTION

This program determines the ROC curve and its
nonparametric confidence bounds for data categorized into
two groups. A ROC curve plots the probability of false
alarm (x-axis) against the probability of detection
(y-axis) for a given test. In medical terms, it relates
the probability of a positive test given no disease to the
probability of a positive test given disease. The ROC
curve may be used to determine an optimal cutoff point for
the test.

The main function is roc(). The other exported functions are used by roc(), but might be useful for other nonparametric statistical procedures.

loggamma: This procedure evaluates the natural logarithm of
gamma(x) for all x>0, accurate to 10 decimal places.
Stirling's formula is used for the central polynomial
part of the procedure. For x=0 a value of
743.746924740801 will be returned: this is
loggamma(9.9999999999E-324).
betain: Computes the incomplete beta function ratio.

Remarks:

Complete beta function:
B(p,q) = gamma(p)*gamma(q)/gamma(p+q)
log(B(p,q)) = ln(gamma(p)) + ln(gamma(q)) - ln(gamma(p+q))

Incomplete beta function ratio:
I_x(p,q) = 1/B(p,q) * int_0^x t^{p-1}*(1-t)^{q-1} dt

--> log(B(p,q)) has to be supplied to calculate I_x(p,q).
log denotes the natural logarithm.

$beta = log(B(p,q))
$x = x
$p = p
$q = q

The subroutine returns I_x(p,q). If an error occurs, a
negative value {-1,-2} is returned.
Betain: Computes the incomplete beta function by calling
loggamma() and betain().
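
The incomplete beta function ratio defined above can be cross-checked by hand when p and q are positive integers, because the integral then reduces to a finite binomial sum. The following is a minimal sketch in core Perl under that assumption. The helper names binom and beta_ratio_int are hypothetical and are not part of Statistics::ROC, and the sketch does not reproduce the algorithm that betain() itself uses.

```perl
use strict;
use warnings;

# Binomial coefficient C(n, k), built up multiplicatively.
sub binom {
    my ($n, $k) = @_;
    my $c = 1;
    $c = $c * ($n - $k + $_) / $_ for 1 .. $k;
    return $c;
}

# Hypothetical helper: I_x(p,q) for positive INTEGER p and q only, using
#   I_x(p,q) = sum_{j=p}^{p+q-1} C(p+q-1, j) * x^j * (1-x)^(p+q-1-j)
sub beta_ratio_int {
    my ($x, $p, $q) = @_;
    my $n   = $p + $q - 1;
    my $sum = 0;
    $sum += binom($n, $_) * $x**$_ * (1 - $x)**($n - $_) for $p .. $n;
    return $sum;
}

printf "%.4f\n", beta_ratio_int(0.6, 3, 4);   # prints 0.8208
```

For non-integer p or q this shortcut does not apply, and a general routine such as betain() is needed.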
xinbta: Computes the inverse of the incomplete beta function ratio.

Remarks:

Complete beta function:
B(p,q) = gamma(p)*gamma(q)/gamma(p+q)
log(B(p,q)) = ln(gamma(p)) + ln(gamma(q)) - ln(gamma(p+q))

Incomplete beta function ratio:
alpha = I_x(p,q) = 1/B(p,q) * int_0^x t^{p-1}*(1-t)^{q-1} dt

--> log(B(p,q)) has to be supplied to calculate I_x(p,q).
log denotes the natural logarithm.

$beta = log(B(p,q))
$alpha = I_x(p,q)
$p = p
$q = q

The subroutine returns x. If an error occurs, a
negative value {-1,-2,-3} is returned.
Xinbta: Computes the inverse of the incomplete beta function
by calling loggamma() and xinbta().
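
Inverting the ratio means finding the x for which I_x(p,q) equals a given alpha. Because I_x(p,q) increases monotonically in x on [0, 1], plain bisection is enough to illustrate the idea. The sketch below assumes positive integer p and q, so that the ratio can be evaluated as a binomial sum. Both helper names are hypothetical, and bisection is a stand-in for illustration, not the method that xinbta() implements.

```perl
use strict;
use warnings;

# I_x(p,q) for positive integer p, q (binomial-sum form; illustration only).
sub beta_ratio_int {
    my ($x, $p, $q) = @_;
    my $n = $p + $q - 1;
    my ($sum, $c) = (0, 1);               # $c holds C(n, j), starting at j = 0
    for my $j (0 .. $n) {
        $sum += $c * $x**$j * (1 - $x)**($n - $j) if $j >= $p;
        $c = $c * ($n - $j) / ($j + 1);   # C(n, j+1) from C(n, j)
    }
    return $sum;
}

# Hypothetical helper: solve I_x(p,q) = alpha for x by bisection on [0, 1].
sub inverse_beta_ratio_int {
    my ($p, $q, $alpha) = @_;
    my ($lo, $hi) = (0, 1);
    for (1 .. 60) {
        my $mid = ($lo + $hi) / 2;
        if (beta_ratio_int($mid, $p, $q) < $alpha) { $lo = $mid }
        else                                       { $hi = $mid }
    }
    return ($lo + $hi) / 2;
}

printf "%.6f\n", inverse_beta_ratio_int(3, 4, 0.8208);   # prints 0.600000
```

This mirrors the round trip Xinbta(3,4,Betain(.6,3,4)), which should recover the original x of 0.6.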
rank: Computes the ranks of the values specified as the
second argument (an array). Returns a vector of ranks
corresponding to the input vector. Different types of
ranking are possible ('high', 'low', 'mean'), and are
specified as first argument. These differ in the way
ties of the input vector, i.e. identical values, are
treated:
* high: replace ranks of identical values with their highest rank
* low: replace ranks of identical values with their lowest rank
* mean: replace ranks of identical values with the mean of their ranks
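
The three tie-handling rules can be illustrated with a short core-Perl sketch. It assumes ranks are 1-based and assigned in ascending order of value. That ordering is an assumption made for illustration, and rank_sketch is a hypothetical helper, not the module's rank() implementation.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): rank values in ascending
# order, 1-based, resolving ties by 'low', 'high', or 'mean'.
sub rank_sketch {
    my ($type, @v) = @_;
    my @order = sort { $v[$a] <=> $v[$b] } 0 .. $#v;   # indices by value
    my @rank;
    my $i = 0;
    while ($i < @order) {
        # Extend $j to the end of the run of values equal to the one at $i.
        my $j = $i;
        $j++ while $j < $#order && $v[$order[$j + 1]] == $v[$order[$i]];
        my ($low, $high) = ($i + 1, $j + 1);
        my $r = $type eq 'low'  ? $low
              : $type eq 'high' ? $high
              :                   ($low + $high) / 2;
        $rank[$_] = $r for @order[$i .. $j];
        $i = $j + 1;
    }
    return @rank;
}

my @e = (0.7, 0.7, 0.9, 0.6, 1.0, 1.1, 1, .7, .6);
print join(' ', rank_sketch($_, @e)), "\n" for qw(low high mean);
# low:  3 3 6 1 7 9 7 3 1
# high: 5 5 6 2 8 9 8 5 2
# mean: 4 4 6 1.5 7.5 9 7.5 4 1.5
```

The three tied values 0.7 occupy sorted positions 3 to 5, so they receive rank 3, 5, or 4 depending on the chosen type.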
roc: Determines the ROC curve and its nonparametric
confidence bounds. The ROC curve shows the
relationship of "probability of false alarm" (x-axis)
to "probability of detection" (y-axis) for a certain
test. Or in medical terms: the "probability of a
positive test, given no disease" to the "probability
of a positive test, given disease". The ROC curve may
be used to determine an "optimal" cutoff point for the
test.

The routine takes three arguments:

(1) type of model: 'decrease' or 'increase'. This
states the assumption that a higher value ('increase')
or a lower value ('decrease') of the data tends to be
an indicator of a positive test result.

(2) two-sided confidence interval (usually 0.95 is
chosen).

(3) the data stored as a list-of-lists: each entry in
this list consists of a "value / true group" pair,
i.e. value / disease present. Group values are from
{0,1}. 0 stands for disease (or signal) not present
(prior knowledge) and 1 for disease (or signal)
present (prior knowledge). Example:

@s=([2, 0], [12.5, 1], [3, 0], [10, 1], [9.5, 0], [9, 1]);

Notice the small overlap of the groups. The optimal cutoff
point to separate the two groups would be between 9
and 9.5 if the criterion of optimality is to maximize
the probability of detection and simultaneously
minimize the probability of false alarm.

Returns a list-of-lists with the three curves:

@ROC=([@lower_b], [@roc], [@upper_b])

Each of the curves is again a list-of-lists with each
entry consisting of one (x,y) pair.
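
To make the (x,y) pairs concrete, the sketch below computes the empirical points of a ROC curve by sweeping a cutoff over the observed values. It covers only the 'increase' model, assumes that both groups are non-empty, and treats a value greater than or equal to the cutoff as a positive test. The helper name empirical_roc_points is hypothetical. The sketch does not compute confidence bounds and is not the algorithm roc() uses, so its points need not match the module's output exactly.

```perl
use strict;
use warnings;

# Hypothetical helper: empirical (false alarm, detection) points for the
# 'increase' model, where a value >= cutoff counts as a positive test.
sub empirical_roc_points {
    my @data  = @_;                                 # ([value, group], ...)
    my $n_pos = grep { $_->[1] == 1 } @data;        # count of group 1
    my $n_neg = @data - $n_pos;                     # count of group 0
    my %seen;
    my @cutoffs = sort { $b <=> $a }
                  grep { !$seen{$_}++ } map { $_->[0] } @data;
    my @points = ([0, 0]);                          # cutoff above every value
    for my $c (@cutoffs) {
        my $tp = grep { $_->[1] == 1 && $_->[0] >= $c } @data;
        my $fp = grep { $_->[1] == 0 && $_->[0] >= $c } @data;
        push @points, [$fp / $n_neg, $tp / $n_pos];
    }
    return @points;
}

my @s = ([2, 0], [12.5, 1], [3, 0], [10, 1], [9.5, 0], [9, 1]);
printf "%.3f %.3f\n", @$_ for empirical_roc_points(@s);
```

Each printed line is one (false alarm, detection) pair, starting at (0,0) and ending at (1,1) as the cutoff is lowered past every value.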
Examples:

use Statistics::ROC;

$,=" ";
print loggamma(10), "\n";
print Xinbta(3,4,Betain(.6,3,4)),"\n";

@e=(0.7, 0.7, 0.9, 0.6, 1.0, 1.1, 1,.7,.6);
print rank('low',@e),"\n";
print rank('high',@e),"\n";
print rank('mean',@e),"\n";

@var_grp=([1.5,0],[1.4,0],[1.4,0],[1.3,0],[1.2,0],[1,0],[0.8,0],
[1.1,1],[1,1],[1,1],[0.9,1],[0.7,1],[0.7,1],[0.6,1]);

@curves=roc('decrease',0.95,@var_grp);
print "$curves[0][2][0] $curves[0][2][1] \n";

AUTHOR

Hans A. Kestler, hans.kestler@medizin.uni-ulm.de or h.kestler@ieee.org
