roc(3)

NAME

Statistics::ROC - receiver-operator-characteristic (ROC)
curves with nonparametric confidence bounds

SYNOPSIS

use Statistics::ROC;
my ($y)    = loggamma($x);
my ($y)    = betain($x, $p, $q, $beta);
my ($y)    = Betain($x, $p, $q);
my ($y)    = xinbta($p, $q, $beta, $alpha);
my ($y)    = Xinbta($p, $q, $alpha);
my (@rk)   = rank($type, @r);
my (@ROC)  = roc($model_type,$conf,@val_grp);

DESCRIPTION

This module determines the ROC curve and its
nonparametric confidence bounds for data categorized into
two groups. A ROC curve shows the relationship of the
probability of false alarm (x-axis) to the probability of
detection (y-axis) for a certain test. In medical terms, it
relates the probability of a positive test given no disease
to the probability of a positive test given disease. The ROC
curve may be used to determine an optimal cutoff point for
the test.

The main function is roc(). The other exported functions are used by roc(), but might be useful for other nonparametric statistical procedures.

loggamma
This procedure evaluates the natural logarithm of
gamma(x) for all x>0, accurate to 10 decimal places.
Stirling's formula is used for the central polynomial
part of the procedure. For x=0 a value of
743.746924740801 will be returned: this is
loggamma(9.9999999999E-324).
betain
Computes the incomplete beta function ratio.

Remarks:
Complete beta function:
    B(p,q) = gamma(p)*gamma(q)/gamma(p+q)
    log(B(p,q)) = ln(gamma(p)) + ln(gamma(q)) - ln(gamma(p+q))
Incomplete beta function ratio:
    I_x(p,q) = 1/B(p,q) * int_0^x t^{p-1}*(1-t)^{q-1} dt
--> log(B(p,q)) has to be supplied to calculate I_x(p,q)
log denotes the natural logarithm
    $beta = log(B(p,q))
    $x = x
    $p = p
    $q = q
The subroutine returns I_x(p,q). If an error occurs, a
negative value {-1,-2} is returned.
Betain
Computes the incomplete beta function by calling
loggamma() and betain().
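As a cross-check on the definition of I_x(p,q) given in the remarks for betain, the ratio can be approximated by direct numerical integration. The following is a minimal sketch, not the algorithm used by betain() or Betain(): the helper names simpson and beta_ratio_sketch are hypothetical, composite Simpson's rule is swapped in for the series method of the cited algorithms, and p >= 1, q >= 1 is assumed so that the integrand stays bounded.

```perl
use strict;
use warnings;

# Hypothetical helper: composite Simpson's rule for the integral of
# $f over [$lo, $hi] using $n subintervals ($n must be even).
sub simpson {
    my ($f, $lo, $hi, $n) = @_;
    my $h   = ($hi - $lo) / $n;
    my $sum = $f->($lo) + $f->($hi);
    for my $k (1 .. $n - 1) {
        # Interior points alternate between weight 4 (odd) and 2 (even).
        $sum += ($k % 2 ? 4 : 2) * $f->($lo + $k * $h);
    }
    return $sum * $h / 3;
}

# Hypothetical helper: I_x(p,q) as the integral over [0, x] divided by
# the integral over [0, 1], which equals B(p,q).
# Assumes p >= 1 and q >= 1 (bounded integrand).
sub beta_ratio_sketch {
    my ($x, $p, $q) = @_;
    my $f = sub { my $t = shift; $t**($p - 1) * (1 - $t)**($q - 1) };
    return simpson($f, 0, $x, 1000) / simpson($f, 0, 1, 1000);
}

printf "%.4f\n", beta_ratio_sketch(0.6, 3, 4);   # prints 0.8208
```

For p=3, q=4 the integrand is a polynomial, so the value can be verified by hand: B(3,4) = 1/60 and the integral from 0 to 0.6 is 0.01368, giving 0.8208. If the module is installed, Betain(.6,3,4) should agree with this value to the printed precision.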
xinbta
Computes the inverse of the incomplete beta function ratio.

Remarks:
Complete beta function:
    B(p,q) = gamma(p)*gamma(q)/gamma(p+q)
    log(B(p,q)) = ln(gamma(p)) + ln(gamma(q)) - ln(gamma(p+q))
Incomplete beta function ratio:
    alpha = I_x(p,q) = 1/B(p,q) * int_0^x t^{p-1}*(1-t)^{q-1} dt
--> log(B(p,q)) has to be supplied to calculate I_x(p,q)
log denotes the natural logarithm
    $beta = log(B(p,q))
    $alpha = I_x(p,q)
    $p = p
    $q = q
The subroutine returns x. If an error occurs, a negative
value {-1,-2,-3} is returned.
Xinbta
Computes the inverse of the incomplete beta function
by calling loggamma() and xinbta().
rank
Computes the ranks of the values specified as the
second argument (an array). Returns a vector of ranks
corresponding to the input vector. Different types of
ranking are possible ('high', 'low', 'mean'), and are
specified as first argument. These differ in the way
ties of the input vector, i.e. identical values, are
treated:
· high: replace ranks of identical values with their
highest rank
· low: replace ranks of identical values with their
lowest rank
· mean: replace ranks of identical values with the
mean of their ranks
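The three tie-handling modes can be illustrated with a short, self-contained sketch. This is a hypothetical re-implementation for illustration only (the name rank_sketch is not part of the module, and the module's rank() may work differently internally); it assumes ranks start at 1 for the smallest value.

```perl
use strict;
use warnings;

# Hypothetical illustration of 'low', 'high' and 'mean' tie handling.
sub rank_sketch {
    my ($type, @r) = @_;
    # Indices of @r ordered by ascending value; position $k in this
    # order corresponds to rank $k + 1.
    my @order = sort { $r[$a] <=> $r[$b] } 0 .. $#r;
    my @rk;
    my $i = 0;
    while ($i < @order) {
        # Extend $j to the end of the run of values tied with position $i.
        my $j = $i;
        $j++ while $j < $#order && $r[$order[$j + 1]] == $r[$order[$i]];
        my ($low, $high) = ($i + 1, $j + 1);
        my $rank = $type eq 'low'  ? $low
                 : $type eq 'high' ? $high
                 :                   ($low + $high) / 2;   # 'mean'
        # Every member of the tied run receives the same rank.
        $rk[$_] = $rank for @order[$i .. $j];
        $i = $j + 1;
    }
    return @rk;
}

my @e = (0.7, 0.7, 0.9, 0.6, 1.0, 1.1, 1, .7, .6);
print join(' ', rank_sketch('low',  @e)), "\n";   # 3 3 6 1 7 9 7 3 1
print join(' ', rank_sketch('high', @e)), "\n";   # 5 5 6 2 8 9 8 5 2
print join(' ', rank_sketch('mean', @e)), "\n";   # 4 4 6 1.5 7.5 9 7.5 4 1.5
```

The three values 0.7 occupy sorted positions 3 to 5, so they receive rank 3 under 'low', 5 under 'high', and 4 under 'mean'.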
roc
Determines the ROC curve and its nonparametric
confidence bounds. The ROC curve shows the
relationship of "probability of false alarm" (x-axis)
to "probability of detection" (y-axis) for a certain
test. Or in medical terms: the "probability of a
positive test, given no disease" to the "probability
of a positive test, given disease". The ROC curve may
be used to determine an "optimal" cutoff point for the
test.
The routine takes three arguments:
(1) type of model: 'decrease' or 'increase'. This
states the assumption that a higher value of the data
('increase') or a lower value ('decrease') tends to
indicate a positive test result.
(2) confidence level of the two-sided confidence
interval (usually 0.95 is chosen).
(3) the data stored as a list-of-lists: each entry in
this list consists of a "value / true group" pair,
i.e. value / disease present. Group values are from
{0,1}: 0 stands for disease (or signal) not present
(prior knowledge) and 1 for disease (or signal)
present (prior knowledge). Example: @s=([2, 0],
[12.5, 1], [3, 0], [10, 1], [9.5, 0], [9, 1]); Notice
the small overlap of the groups. The optimal cutoff
point to separate the two groups would be between 9
and 9.5 if the criterion of optimality is to maximize
the probability of detection and simultaneously
minimize the probability of false alarm.
Returns a list-of-lists with the three curves:
@ROC=([@lower_b], [@roc], [@upper_b]); each of the curves
is again a list-of-lists with each entry consisting of
one (x,y) pair.
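To make the input and output layout concrete, the sketch below computes only the empirical ROC points from a list of value / group pairs. It is a hypothetical helper (empirical_roc is not part of the module) and deliberately omits the nonparametric confidence bounds that roc() adds. It assumes that under 'increase' a test counts as positive when the value is greater than or equal to the cutoff, and under 'decrease' when it is less than or equal to the cutoff; the module may treat ties at the cutoff differently.

```perl
use strict;
use warnings;

# Hypothetical helper: empirical ROC points only, no confidence bounds.
# Returns a list of [false-alarm probability, detection probability] pairs.
sub empirical_roc {
    my ($model_type, @val_grp) = @_;
    my $n_pos = grep { $_->[1] == 1 } @val_grp;   # signal present
    my $n_neg = grep { $_->[1] == 0 } @val_grp;   # signal not present
    die "both groups must be non-empty\n" unless $n_pos && $n_neg;

    # Distinct values serve as candidate cutoffs, visited from the
    # strictest cutoff to the most permissive one.
    my %seen;
    my @cutoffs = grep { !$seen{$_}++ } map { $_->[0] } @val_grp;
    @cutoffs = $model_type eq 'increase'
        ? sort { $b <=> $a } @cutoffs
        : sort { $a <=> $b } @cutoffs;

    my @points = ([0, 0]);   # cutoff so strict that no test is positive
    for my $c (@cutoffs) {
        my ($tp, $fp) = (0, 0);
        for my $pair (@val_grp) {
            my ($v, $g) = @$pair;
            my $positive = $model_type eq 'increase' ? $v >= $c : $v <= $c;
            next unless $positive;
            if ($g == 1) { $tp++ } else { $fp++ }
        }
        push @points, [$fp / $n_neg, $tp / $n_pos];
    }
    return @points;
}

my @s = ([2, 0], [12.5, 1], [3, 0], [10, 1], [9.5, 0], [9, 1]);
printf "%.3f %.3f\n", @$_ for empirical_roc('increase', @s);
```

Each printed line is one (x,y) pair, the same shape as the entries of each curve returned by roc().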
Examples

use Statistics::ROC;

$,=" ";   # separate printed list items with a space
print loggamma(10), "\n";
print Xinbta(3,4,Betain(.6,3,4)),"\n";
@e=(0.7, 0.7, 0.9, 0.6, 1.0, 1.1, 1,.7,.6);
print rank('low',@e),"\n";
print rank('high',@e),"\n";
print rank('mean',@e),"\n";
@var_grp=([1.5,0],[1.4,0],[1.4,0],[1.3,0],[1.2,0],[1,0],[0.8,0],
[1.1,1],[1,1],[1,1],[0.9,1],[0.7,1],[0.7,1],[0.6,1]);
@curves=roc('decrease',0.95,@var_grp);
# first curve (lower bound), third point: x and y
print "$curves[0][2][0] $curves[0][2][1] \n";

AUTHOR

Hans A. Kestler, hans.kestler@medizin.uni-ulm.de or h.kestler@ieee.org

SEE ALSO

Perl/Tk user interface for drawing ROC curves (to be
uploaded shortly).

R.A. Hilgers, Distribution-Free Confidence Bounds for ROC
Curves (1991), Meth Inform Med, 30:96-101

Algorithm 291, Logarithm of the gamma function. Collected Algorithms of the ACM, Vol II, 1980

Numerical Recipes in C, second edition, by Press, Teukolsky, Vetterling and Flannery, Cambridge University
Press, 1992.

G.W. Cran, K.J. Martin and G.E. Thomas (1977). Remark AS
R19 and Algorithm AS 109, A Remark on Algorithms AS 63: The
Incomplete Beta Integral, AS 64: Inverse of the Incomplete
Beta Function Ratio, Appl Statist, 26:111-114.

K.J. Berry, P.W. Mielke, Jr and G.W. Cran (1990) Algorithm
AS R83, A Remark on Algorithm AS 109: Inverse of the
Incomplete Beta Function Ratio, Appl Statist, 39:309-310.