A Generalized Model of PAC Learning and its Applicability

Thomas Brodag; Steffen Herbold; Stephan Waack

RAIRO - Theoretical Informatics and Applications - Informatique Théorique et Applications (2014)

  • Volume: 48, Issue: 2, pages 209–245
  • ISSN: 0988-3754

Abstract

We combine a new data model, in which the random classification is subject to rather weak restrictions based on the Mammen-Tsybakov [E. Mammen and A.B. Tsybakov, Ann. Statist. 27 (1999) 1808–1829; A.B. Tsybakov, Ann. Statist. 32 (2004) 135–166] small margin conditions, with the statistical query (SQ) model due to Kearns [M.J. Kearns, J. ACM 45 (1998) 983–1006] into what we refer to as the PAC + SQ model. We generalize the class conditional constant noise (CCCN) model introduced by Decatur [S.E. Decatur, in ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1997) 83–91] to the noise model orthogonal to a set of query functions. We show that every polynomial-time PAC + SQ learning algorithm can be efficiently simulated, provided that the random noise rate is orthogonal, given the target concept, to the query functions used by the algorithm. Furthermore, we extend the constant-partition classification noise (CPCN) model due to Decatur [ibid.] to what we call the constant-partition piecewise orthogonal (CPPO) noise model. We show how statistical queries can be simulated in the CPPO scenario, provided that the partition is known to the learner. Finally, we show how PAC + SQ simulators can be used in practice under the noise model orthogonal to the query space by presenting two examples, one from bioinformatics and one from software engineering. In this way, we demonstrate that our new noise model is realistic.
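
To make the flavor of such a simulation concrete, here is a minimal Python sketch of the special case that the paper generalizes: answering a statistical query E[chi(x, c(x))] from examples whose labels are corrupted by class conditional constant noise (CCCN) with known flip rates eta0 = P(flip | c(x) = 0) and eta1 = P(flip | c(x) = 1), in the spirit of Decatur [18]. The function name and the NumPy-based interface are ours, for illustration only; the paper's orthogonal and CPPO noise models are strictly more general than this known-rate setting.

import numpy as np

def simulate_sq_under_cccn(chi, X, y_noisy, eta0, eta1):
    """Estimate the statistical query E[chi(x, c(x))] from CCCN-corrupted labels.

    Illustrative sketch, not the paper's algorithm.
    chi     -- query function chi(x, label) -> [0, 1]
    X       -- sequence of examples
    y_noisy -- observed 0/1 labels; P(flip | c(x)=0) = eta0, P(flip | c(x)=1) = eta1
    Requires eta0 + eta1 < 1, the usual information-theoretic limit.
    """
    assert eta0 + eta1 < 1.0
    y = np.asarray(y_noisy)
    chi0 = np.array([chi(x, 0) for x in X])  # query value if the true label were 0
    chi1 = np.array([chi(x, 1) for x in X])  # query value if the true label were 1
    # Empirical versions of four observable expectations over the noisy sample,
    # with a0 = E[chi(x,0) 1{c=0}], b0 = E[chi(x,0) 1{c=1}], and analogously a1, b1:
    m0  = np.mean(chi0 * (y == 0))  # estimates (1-eta0) a0 + eta1 b0
    m0p = np.mean(chi0 * (y == 1))  # estimates    eta0  a0 + (1-eta1) b0
    m1  = np.mean(chi1 * (y == 1))  # estimates    eta0  a1 + (1-eta1) b1
    m1p = np.mean(chi1 * (y == 0))  # estimates (1-eta0) a1 +    eta1  b1
    # Invert the two 2x2 noise systems for a0 and b1; the clean query is exactly a0 + b1.
    denom = 1.0 - eta0 - eta1
    a0 = ((1.0 - eta1) * m0 - eta1 * m0p) / denom
    b1 = ((1.0 - eta0) * m1 - eta0 * m1p) / denom
    return a0 + b1

Because the returned value is a fixed linear combination of four empirical means with coefficients bounded by 1/(1 - eta0 - eta1), Hoeffding's inequality yields a sample size of roughly O(log(1/delta) / (tau (1 - eta0 - eta1))^2) to answer the query within tolerance tau; polynomial overhead of this kind is what efficiency arguments for SQ simulations rest on. This is the standard argument for the CCCN special case, not a statement of the paper's exact bounds.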

How to cite


Brodag, Thomas, Steffen Herbold, and Stephan Waack. "A Generalized Model of PAC Learning and its Applicability." RAIRO - Theoretical Informatics and Applications - Informatique Théorique et Applications 48.2 (2014): 209-245. <http://eudml.org/doc/273063>.

@article{Brodag2014,
abstract = {We combine a new data model, in which the random classification is subject to rather weak restrictions based on the Mammen-Tsybakov [E. Mammen and A.B. Tsybakov, Ann. Statist. 27 (1999) 1808–1829; A.B. Tsybakov, Ann. Statist. 32 (2004) 135–166] small margin conditions, with the statistical query (SQ) model due to Kearns [M.J. Kearns, J. ACM 45 (1998) 983–1006] into what we refer to as the PAC + SQ model. We generalize the class conditional constant noise (CCCN) model introduced by Decatur [S.E. Decatur, in ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1997) 83–91] to the noise model orthogonal to a set of query functions. We show that every polynomial-time PAC + SQ learning algorithm can be efficiently simulated, provided that the random noise rate is orthogonal, given the target concept, to the query functions used by the algorithm. Furthermore, we extend the constant-partition classification noise (CPCN) model due to Decatur [ibid.] to what we call the constant-partition piecewise orthogonal (CPPO) noise model. We show how statistical queries can be simulated in the CPPO scenario, provided that the partition is known to the learner. Finally, we show how PAC + SQ simulators can be used in practice under the noise model orthogonal to the query space by presenting two examples, one from bioinformatics and one from software engineering. In this way, we demonstrate that our new noise model is realistic.},
author = {Brodag, Thomas and Herbold, Steffen and Waack, Stephan},
journal = {RAIRO - Theoretical Informatics and Applications - Informatique Théorique et Applications},
keywords = {PAC learning with classification noise; Mammen-Tsybakov small margin conditions; statistical queries; noise model orthogonal to a set of query functions; bioinformatics; software engineering},
language = {eng},
number = {2},
pages = {209-245},
publisher = {EDP-Sciences},
title = {A Generalized Model of PAC Learning and its Applicability},
url = {http://eudml.org/doc/273063},
volume = {48},
year = {2014},
}

TY - JOUR
AU - Brodag, Thomas
AU - Herbold, Steffen
AU - Waack, Stephan
TI - A Generalized Model of PAC Learning and its Applicability
JO - RAIRO - Theoretical Informatics and Applications - Informatique Théorique et Applications
PY - 2014
PB - EDP-Sciences
VL - 48
IS - 2
SP - 209
EP - 245
AB - We combine a new data model, in which the random classification is subject to rather weak restrictions based on the Mammen-Tsybakov [E. Mammen and A.B. Tsybakov, Ann. Statist. 27 (1999) 1808–1829; A.B. Tsybakov, Ann. Statist. 32 (2004) 135–166] small margin conditions, with the statistical query (SQ) model due to Kearns [M.J. Kearns, J. ACM 45 (1998) 983–1006] into what we refer to as the PAC + SQ model. We generalize the class conditional constant noise (CCCN) model introduced by Decatur [S.E. Decatur, in ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1997) 83–91] to the noise model orthogonal to a set of query functions. We show that every polynomial-time PAC + SQ learning algorithm can be efficiently simulated, provided that the random noise rate is orthogonal, given the target concept, to the query functions used by the algorithm. Furthermore, we extend the constant-partition classification noise (CPCN) model due to Decatur [ibid.] to what we call the constant-partition piecewise orthogonal (CPPO) noise model. We show how statistical queries can be simulated in the CPPO scenario, provided that the partition is known to the learner. Finally, we show how PAC + SQ simulators can be used in practice under the noise model orthogonal to the query space by presenting two examples, one from bioinformatics and one from software engineering. In this way, we demonstrate that our new noise model is realistic.
LA - eng
KW - PAC learning with classification noise; Mammen-Tsybakov small margin conditions; statistical queries; noise model orthogonal to a set of query functions; bioinformatics; software engineering
UR - http://eudml.org/doc/273063
ER -

References

  [1] D.W. Aha and D. Kibler, Instance-based learning algorithms. Machine Learn. 6 (1991) 37–66. Zbl 0709.68044
  [2] D. Angluin and P. Laird, Learning from noisy examples. Machine Learn. 2 (1988) 343–370.
  [3] Apache HTTP Server, http://httpd.apache.org/ (2011).
  [4] J.A. Aslam, Noise Tolerant Algorithms for Learning and Searching. Ph.D. thesis, MIT (1995).
  [5] J.A. Aslam and S.E. Decatur, Specification and simulation of statistical query algorithms for efficiency and noise tolerance. J. Comput. Syst. Sci. 56 (1998) 191–208. Zbl 0912.68063, MR 1629619
  [6] P.L. Bartlett, S. Boucheron and G. Lugosi, Model selection and error estimation. Machine Learn. 48 (2002) 85–113. Zbl 0998.68117
  [7] P.L. Bartlett, M.I. Jordan and J.D. McAuliffe, Convexity, classification, and risk bounds. J. Amer. Statist. Assoc. 101 (2006) 138–156. Zbl 1118.62330, MR 2268032
  [8] P.L. Bartlett and S. Mendelson, Rademacher and Gaussian complexities: risk bounds and structural results, in 14th COLT and 5th EuroCOLT (2001) 224–240. Zbl 0992.68106, MR 2042038
  [9] P.L. Bartlett and S. Mendelson, Rademacher and Gaussian complexities: risk bounds and structural results. J. Mach. Learn. Res. 3 (2002) 463–482. Zbl 1084.68549, MR 1984026
  [10] A. Blumer, A. Ehrenfeucht, D. Haussler and M.K. Warmuth, Learnability and the Vapnik-Chervonenkis dimension. J. ACM 36 (1989) 929–969. Zbl 0697.68079, MR 1072253
  [11] O. Bousquet, S. Boucheron and G. Lugosi, Introduction to statistical learning theory, in Adv. Lect. Machine Learn. (2003) 169–207. Zbl 1120.68428
  [12] O. Bousquet, S. Boucheron and G. Lugosi, Introduction to statistical learning theory, in Adv. Lect. Machine Learn., vol. 3176 of Lect. Notes in Artificial Intelligence. Springer, Heidelberg (2004) 169–207. Zbl 1120.68428
  [13] Th. Brodag, PAC-Lernen zur Insolvenzerkennung und Hotspot-Identifikation [PAC learning for insolvency detection and hotspot identification]. Ph.D. thesis, Ph.D. Programme in Computer Science, Georg-August University School of Science (GAUSS) (2008).
  [14] N. Cesa-Bianchi, S. Shalev-Shwartz and O. Shamir, Online learning of noisy data. IEEE Trans. Inform. Theory 57 (2011) 7907–7931. MR 2895368
  [15] S.E. Decatur, Learning in hybrid noise environments using statistical queries, in Fifth Int. Workshop on Artificial Intelligence and Statistics. Lect. Notes Statis. Springer (1993).
  [16] S.E. Decatur, Statistical queries and faulty PAC oracles, in COLT (1993) 262–268.
  [17] S.E. Decatur, Efficient Learning from Faulty Data. Ph.D. thesis, Harvard University (1995). MR 2693616
  [18] S.E. Decatur, PAC learning with constant-partition classification noise and applications to decision tree induction, in ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1997) 83–91.
  [19] S.E. Decatur and R. Gennaro, On learning from noisy and incomplete examples, in COLT (1995) 353–360.
  [20] L. Devroye, L. Györfi and G. Lugosi, A Probabilistic Theory of Pattern Recognition. Springer, New York (1997). Zbl 0853.68150, MR 1383093
  [21] Eclipse Java Development Tools (JDT), http://www.eclipse.org/jdt/ (2011).
  [22] Eclipse Platform, http://www.eclipse.org/platform/ (2011).
  [23] N. Fenton and S.L. Pfleeger, Software Metrics: A Rigorous and Practical Approach. PWS Publishing Co., Boston, MA, USA (1997). Zbl 0813.68061
  [24] S.A. Goldman and R.H. Sloan, Can PAC learning algorithms tolerate random attribute noise? Algorithmica 14 (1995) 70–84. Zbl 0837.68094, MR 1329816
  [25] I. Halperin, H. Wolfson and R. Nussinov, Protein-protein interactions: coupling of structurally conserved residues and of hot spots across interfaces. Implications for docking. Structure 12 (2004) 1027–1036.
  [26] D. Haussler, Quantifying inductive bias: AI learning algorithms and Valiant’s learning framework. Artificial Intelligence 36 (1988) 177–221. Zbl 0651.68104, MR 960589
  [27] D. Haussler, M.J. Kearns, N. Littlestone and M.K. Warmuth, Equivalence of models for polynomial learnability. Inform. Comput. 95 (1991) 129–161. Zbl 0743.68115, MR 1138115
  [28] S. Herbold, J. Grabowski and S. Waack, Calculation and optimization of thresholds for sets of software metrics. Empirical Software Engrg. (2011) 1–30. DOI 10.1007/s10664-011-9162-z.
  [29] International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), Geneva, Switzerland. Software engineering – Product quality, Parts 1–4 (2001–2004).
  [30] G. John and P. Langley, Estimating continuous distributions in Bayesian classifiers, in Proc. of the Eleventh Conf. on Uncertainty in Artificial Intelligence. Morgan Kaufmann (1995) 338–345.
  [31] M.J. Kearns, Efficient noise-tolerant learning from statistical queries. J. ACM 45 (1998) 983–1006. Zbl 1065.68605, MR 1678849
  [32] M.J. Kearns and M. Li, Learning in the presence of malicious errors. SIAM J. Comput. 22 (1993) 807–837. Zbl 0789.68118, MR 1227763
  [33] M.J. Kearns and R.E. Schapire, Efficient distribution-free learning of probabilistic concepts. J. Comput. Syst. Sci. 48 (1994) 464–497. Zbl 0822.68093, MR 1279411
  [34] V. Koltchinskii, Rademacher penalties and structural risk minimization. IEEE Trans. Inform. Theory 47 (2001) 1902–1914. Zbl 1008.62614, MR 1842526
  [35] E. Mammen and A.B. Tsybakov, Smooth discrimination analysis. Ann. Statist. 27 (1999) 1808–1829. Zbl 0961.62058, MR 1765618
  [36] P. Massart, Some applications of concentration inequalities to statistics. Annales de la Faculté des Sciences de Toulouse, volume spécial dédié à Michel Talagrand (2000) 245–303. Zbl 0986.62002, MR 1813803
  [37] S. Mendelson, Rademacher averages and phase transitions in Glivenko-Cantelli classes. IEEE Trans. Inform. Theory 48 (2002) 1977–1991. Zbl 1059.60027, MR 1872178
  [38] I.S. Moreira, P.A. Fernandes and M.J. Ramos, Hot spots – a review of the protein-protein interface determinant amino-acid residues. Proteins: Structure, Function, and Bioinformatics 68 (2007) 803–812.
  [39] D.F. Nettleton, A. Orriols-Puig and A. Fornells, A study of the effect of different types of noise on the precision of supervised learning techniques. Artif. Intell. Rev. 33 (2010) 275–306.
  [40] Y. Ofran and B. Rost, ISIS: interaction sites identified from sequence. Bioinformatics 23 (2007) 13–16.
  [41] Y. Ofran and B. Rost, Protein-protein interaction hotspots carved into sequences. PLoS Comput. Biol. 3 (2007).
  [42] J.C. Platt, Fast training of support vector machines using sequential minimal optimization, in Advances in Kernel Methods, edited by B. Schölkopf, Ch.J.C. Burges and A.J. Smola. MIT Press, Cambridge, MA, USA (1999) 185–208.
  [43] J.R. Quinlan, C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1993).
  [44] L. Ralaivola, F. Denis and Ch.N. Magnan, CN = CPCN, in ICML ’06: Proc. of the 23rd Int. Conf. on Machine Learn. ACM, New York, NY, USA (2006) 721–728.
  [45] B. Schölkopf and A.J. Smola, Learning with Kernels. MIT Press (2002). Zbl 1019.68094
  [46] K.S. Thorn and A.A. Bogan, ASEdb: a database of alanine mutations and their effects on the free energy of binding in protein-protein interactions. Bioinformatics 17 (2001) 284–285.
  [47] A.B. Tsybakov, Optimal aggregation of classifiers in statistical learning. Ann. Statist. 32 (2004) 135–166. Zbl 1105.62353, MR 2051002
  [48] L. Valiant, A theory of the learnable. Commun. ACM 27 (1984) 1134–1142. Zbl 0587.68077
  [49] L. Valiant, Learning disjunctions of conjunctions, in Proc. of the 9th Int. Joint Conf. on Artificial Intelligence (1985) 560–566.
