Monograph

On the Two-fold Role of Logic Constraints in Deep Learning

  • Gabriele Ciravegna

Deep Learning (DL) is a branch of Artificial Intelligence (AI) that focuses on training deep neural networks. Thanks to their ability to process large amounts of data, these networks have achieved remarkable results across a variety of fields. Despite these successes, DL still faces several limitations that hinder its adoption in real-world scenarios. This thesis addresses three key challenges: reducing the need for supervision, defending against adversarial attacks, and explaining neural network behavior. The first two challenges are tackled through learning from constraints, which incorporates domain knowledge to guide the learning process and enhance model robustness. The third challenge, on the other hand, is addressed using learning of constraints, which helps identify and formalize logical relationships among learned tasks, thereby providing interpretable explanations of the networks’ behavior.
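
To make the learning-from-constraints idea above concrete, the sketch below (a hypothetical illustration, not code from the thesis) relaxes a first-order rule such as cat(x) ⇒ animal(x) into a differentiable penalty using a product t-norm, so that it can be added to the ordinary supervised loss; the predicate names, probabilities, and weighting factor are assumptions made only for this example.

```python
# Hypothetical sketch of "learning from constraints": the rule cat(x) -> animal(x)
# is relaxed with a product t-norm into a differentiable penalty that can be
# added to the usual supervised loss. Names and numbers are illustrative only.
import torch

def implication_penalty(p_cat: torch.Tensor, p_animal: torch.Tensor) -> torch.Tensor:
    # Product t-norm relaxation of "cat -> animal": the violation grows with
    # p_cat * (1 - p_animal) and vanishes when the rule is satisfied.
    return (p_cat * (1.0 - p_animal)).mean()

# Toy predicate probabilities for a batch of four inputs.
p_cat = torch.tensor([0.9, 0.2, 0.7, 0.1], requires_grad=True)
p_animal = torch.tensor([0.95, 0.1, 0.3, 0.8], requires_grad=True)

supervised_loss = torch.tensor(0.0)                    # stands in for the usual data-fitting term
constraint_loss = implication_penalty(p_cat, p_animal)
total_loss = supervised_loss + 0.1 * constraint_loss   # 0.1: arbitrary constraint weight
total_loss.backward()                                  # gradients also flow through the rule
print(float(constraint_loss))
```

A rule-violation score of this kind can also be evaluated at test time, where a strong violation of the known domain rules flags an input as inconsistent with the domain knowledge; this is one way logic constraints can contribute to adversarial robustness.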

  • Keywords: Deep Learning (DL), Logic Constraints, Active Learning, Adversarial Defense, Logic Explanations

Gabriele Ciravegna

Centai Institute, Italy - ORCID: 0000-0002-6799-1043

Gabriele Ciravegna is a researcher at the Centai Institute, working on improving the comprehensibility and robustness of neural networks. His research has been published in leading conferences and journals in the field. He received the IEEE Caianiello Award for the Best PhD Thesis. He teaches Machine Learning and Explainable AI at Politecnico di Torino.
PDF
  • Publication Year: 2025
  • Pages: 126
  • eISBN: 979-12-215-0680-8
  • Content License: CC BY 4.0
  • © 2025 Author(s)

XML
  • Publication Year: 2025
  • eISBN: 979-12-215-0681-5
  • Content License: CC BY 4.0
  • © 2025 Author(s)

PRINT
  • Publication Year: 2025
  • Pages: 126
  • ISBN: 979-12-215-0679-2
  • Content License: CC BY 4.0
  • © 2025 Author(s)

Bibliographic Information

Book Title

On the Two-fold Role of Logic Constraints in Deep Learning

Authors

Gabriele Ciravegna

Peer Reviewed

Number of Pages

126

Publication Year

2025

Copyright Information

© 2025 Author(s)

Content License

CC BY 4.0

Metadata License

CC0 1.0

Publisher Name

Firenze University Press

DOI

10.36253/979-12-215-0680-8

ISBN Print

979-12-215-0679-2

eISBN (pdf)

979-12-215-0680-8

eISBN (xml)

979-12-215-0681-5

Series Title

Premio Tesi di Dottorato Città di Firenze
