
Relative loss bounds for multidimensional regression problems

Kivinen, Jyrki; Warmuth, Manfred


dc.contributor.author: Kivinen, Jyrki
dc.contributor.author: Warmuth, Manfred
dc.date.accessioned: 2015-12-10T23:35:59Z
dc.identifier.issn: 0885-6125
dc.identifier.uri: http://hdl.handle.net/1885/69960
dc.description.abstract: We study on-line generalized linear regression with multidimensional outputs, i.e., neural networks with multiple output nodes but no hidden nodes. We allow at the final layer transfer functions such as the softmax function that need to consider the linear activations to all the output neurons. The weight vectors used to produce the linear activations are represented indirectly by maintaining separate parameter vectors. We get the weight vector by applying a particular parameterization function to the parameter vector. Updating the parameter vectors upon seeing new examples is done additively, as in the usual gradient descent update. However, by using a nonlinear parameterization function between the parameter vectors and the weight vectors, we can make the resulting update of the weight vector quite different from a true gradient descent update. To analyse such updates, we define a notion of a matching loss function and apply it both to the transfer function and to the parameterization function. The loss function that matches the transfer function is used to measure the goodness of the predictions of the algorithm. The loss function that matches the parameterization function can be used both as a measure of divergence between models in motivating the update rule of the algorithm and as a measure of progress in analyzing its relative performance compared to an arbitrary fixed model. As a result, we have a unified treatment that generalizes earlier results for the gradient descent and exponentiated gradient algorithms to multidimensional outputs, including multiclass logistic regression.
dc.publisher: Kluwer Academic Publishers
dc.source: Machine Learning
dc.subject: Keywords: Algorithms; Eigenvalues and eigenfunctions; Transfer functions; Vectors; Generalized linear regressions; Regression analysis; Bregman divergences; Generalized linear regression; On-line prediction; Relative loss bounds
dc.title: Relative loss bounds for multidimensional regression problems
dc.type: Journal article
local.description.notes: Imported from ARIES
local.description.refereed: Yes
local.identifier.citationvolume: 45
dc.date.issued: 2001
local.identifier.absfor: 020204 - Plasma Physics; Fusion Plasmas; Electrical Discharges
local.identifier.ariespublication: MigratedxPub2173
local.type.status: Published Version
local.contributor.affiliation: Kivinen, Jyrki, College of Engineering and Computer Science, ANU
local.contributor.affiliation: Warmuth, Manfred, University of California
local.description.embargo: 2037-12-31
local.bibliographicCitation.issue: 3
local.bibliographicCitation.startpage: 301
local.bibliographicCitation.lastpage: 329
local.identifier.doi: 10.1023/A:1017938623079
dc.date.updated: 2015-12-10T11:47:37Z
local.identifier.scopusID: 2-s2.0-0035575628
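
The abstract describes a mirror-descent-style scheme: the algorithm updates a parameter vector additively, while the weight vector used for prediction is obtained by passing the parameters through a (possibly nonlinear) parameterization function. As a minimal illustrative sketch, not code from the paper, the following Python instantiates the softmax transfer function together with its matching loss, the relative entropy; the specific parameterization, learning rate, and all names below are assumptions made here for illustration. With the identity parameterization the update reduces to ordinary gradient descent on the weights; with a row-wise softmax parameterization the resulting weight update is multiplicative, in the style of the exponentiated gradient family.

    import numpy as np

    def softmax(z):
        # Softmax transfer function: linear activations -> probability vector.
        z = z - z.max()
        e = np.exp(z)
        return e / e.sum()

    def matching_loss(y, p, eps=1e-12):
        # Matching loss for the softmax transfer: relative entropy KL(y || p).
        return float(np.sum(y * (np.log(y + eps) - np.log(p + eps))))

    def online_step(theta, x, y, eta, parameterize):
        # Weights are the image of the parameters under the parameterization map.
        W = parameterize(theta)
        p = softmax(W @ x)            # prediction via the transfer function
        grad_W = np.outer(p - y, x)   # gradient of the matching loss wrt W
        # Additive update in *parameter* space; through a nonlinear
        # parameterization this differs from gradient descent on the weights.
        return theta - eta * grad_W, matching_loss(y, p)

    identity = lambda T: T                                      # plain gradient descent
    row_softmax = lambda T: np.apply_along_axis(softmax, 1, T)  # EG-style weights

    # Tiny synthetic run (illustrative only): 3 outputs, 5 inputs.
    rng = np.random.default_rng(0)
    theta = np.zeros((3, 5))
    for _ in range(100):
        x = rng.normal(size=5)
        y = np.eye(3)[rng.integers(3)]  # one-hot target
        theta, loss = online_step(theta, x, y, eta=0.1, parameterize=row_softmax)

This shows only the update rule; the paper's contribution is the analysis, i.e., relative loss bounds comparing the algorithm's cumulative matching loss against that of an arbitrary fixed model, which the sketch does not reproduce.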
Collections: ANU Research Publications

Download

File: 01_Kivinen_Relative_loss_bounds_for_2001.pdf (209.8 kB, Adobe PDF; request a copy)


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.
