Exam Learning Documentation

Notes on new things learned while taking a College Board exam/quiz

  • Citizen science: scientific research in which members of the public actively participate (e.g., by collecting or analyzing data), with technology and computational tools enabling that participation in data-driven projects
    • Multidisciplinary approach that combines elements of computer science, data analysis, and scientific inquiry to address real-world problems
  • Role of the Internet Engineering Task Force (IETF): develops and publishes the open technical standards (as RFCs) that underpin the Internet, such as the TCP/IP protocol suite, supporting its reliability, security, and interoperability
    • Does so through an open, collaborative process that involves experts from various backgrounds, making it a key player in the continued development of the Internet
  • (Quite obvious, but…) A large amount of information about a user can be collected simply from their downloading from or interacting with a website and its features
    • Downloading is a prime example of this
      • The collected activity is used to advertise similar products, matched to the user’s presumed interests, on other sites/applications on the device
  • Metadata: data that describes other data (e.g., author, creation date, file size, format), so that data is not just a collection of bits and bytes but a valuable and organized resource
    • Ex: found in websites, documents, and digital media; used for data management, information retrieval, archiving/preservation, etc.
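As a rough illustration of the data/metadata distinction, the Python sketch below writes a small file and reads back some of its metadata with the standard `os.stat` call. The file name `notes.txt`, its contents, and the helper `describe_file` are made up for this example.

```python
import os
import tempfile

def describe_file(path):
    """Return a small dict of metadata about the file at `path`."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,   # how many bytes of data the file holds
        "modified": info.st_mtime,    # last-modified time, in seconds since the epoch
    }

# Hypothetical example file: the text "hello" is the data,
# and describe_file() returns metadata about it.
with tempfile.TemporaryDirectory() as folder:
    example_path = os.path.join(folder, "notes.txt")
    with open(example_path, "w", encoding="utf-8") as f:
        f.write("hello")
    example_metadata = describe_file(example_path)

print(example_metadata["name"], example_metadata["size_bytes"])
```

The dictionary never contains the word "hello" itself; it only describes the file that holds it.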
  • Byte Pair Encoding (BPE): a data compression and text tokenization algorithm that repeatedly merges the most frequent pair of adjacent symbols into a single new symbol. It was originally developed for data compression but is now widely used in natural language processing (NLP) for subword tokenization, especially in training neural language models like GPT (Generative Pre-trained Transformer) models
    • Summary/uses: often used as a preprocessing step in NLP tasks to segment text into subword units, which can improve the performance of various NLP models, especially in tasks like machine translation, text generation, and sentiment analysis
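The core idea of BPE can be sketched in a few lines of Python. This is a simplified version that works on a single string and merges the most frequent adjacent pair a fixed number of times; real tokenizers learn a merge table from a large corpus. The function names (`most_frequent_pair`, `merge_pair`, `bpe`) are chosen for this example only.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the most common adjacent pair of symbols, or None if there is none."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return None
    return pairs.most_common(1)[0][0]

def merge_pair(tokens, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged symbol."""
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def bpe(text, num_merges):
    """Start from single characters and apply up to `num_merges` merges."""
    tokens = list(text)
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
    return tokens

# After one merge, the frequent pair ("a", "a") has become the single symbol "aa".
print(bpe("aaabdaaabac", 1))
```

Joining the resulting tokens always gives back the original text, so the merges change how the text is segmented, not what it says.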
  • Bits (short for “binary digits”): basic units of digital information in the binary number system, each holding a value of 0 or 1
    • Used to represent data, perform computations, and enable digital communication
    • Bytes, groups of 8 bits, are more practical for most real-world applications, but bits remain the fundamental building blocks of all digital data
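To make the bit/byte relationship concrete, here is a small Python sketch that prints the 8 bits behind a number and behind each byte of a short piece of text. The helper names `to_bits` and `text_to_bits` are invented for this example.

```python
def to_bits(value, width=8):
    """Return `value` as a string of `width` binary digits, most significant first."""
    return format(value, f"0{width}b")

def text_to_bits(text):
    """Encode text as UTF-8 and show each resulting byte as 8 bits."""
    return [to_bits(byte) for byte in text.encode("utf-8")]

print(to_bits(5))          # 00000101
print(text_to_bits("Hi"))  # ['01001000', '01101001']
print(2 ** 8)              # 256 distinct values fit in one byte
```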
  • Using 128-bit addresses (IPv6) instead of 32-bit addresses (IPv4) allows for a vastly larger address space (2^128 addresses versus 2^32, about 4.3 billion), which is crucial for accommodating the growing number of devices on the Internet and mitigating address exhaustion
    • While it comes with increased complexity, it is a necessary step to future-proof networking and ensure the continued expansion of the Internet
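The size difference is easy to check with a little arithmetic. The sketch below assumes the 32-bit scheme is IPv4 and the 128-bit scheme is IPv6, and uses Python's standard `ipaddress` module to confirm the totals; the variable names are chosen for this example.

```python
import ipaddress

ipv4_total = 2 ** 32    # every possible 32-bit address
ipv6_total = 2 ** 128   # every possible 128-bit address

print(ipv4_total)  # 4294967296
print(ipv6_total)  # 340282366920938463463374607431768211456

# The ipaddress module reports the same sizes for the full address ranges.
ipv4_from_module = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_from_module = ipaddress.ip_network("::/0").num_addresses
print(ipv4_from_module == ipv4_total, ipv6_from_module == ipv6_total)
```

Each extra bit doubles the address space, so going from 32 to 128 bits multiplies it by 2^96.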
  • In public key cryptography, the recipient uses their own private key to decrypt a message that has been encrypted with their public key
    • Process:
      • The sender obtains the recipient’s public key
      • The sender encrypts the message using the recipient’s public key
      • The recipient, who is the only one with access to the corresponding private key, can decrypt the message using their private key
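The three steps above can be sketched with a toy RSA-style example in Python. This is for illustration only: the primes are tiny, there is no padding, and the message is a single small integer, so it is not secure. The function names and the chosen numbers are assumptions made for this example; real systems rely on a vetted cryptography library.

```python
def make_toy_keypair():
    """Build a tiny RSA-style key pair from two small, hard-coded primes."""
    p, q = 61, 53
    n = p * q                  # modulus, shared by both keys
    phi = (p - 1) * (q - 1)
    e = 17                     # public exponent
    d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)
    return (e, n), (d, n)

def encrypt(message, public_key):
    """Sender side: encrypt an integer smaller than n with the public key."""
    e, n = public_key
    return pow(message, e, n)

def decrypt(ciphertext, private_key):
    """Recipient side: only the matching private key recovers the message."""
    d, n = private_key
    return pow(ciphertext, d, n)

public_key, private_key = make_toy_keypair()   # recipient generates both keys
ciphertext = encrypt(65, public_key)           # sender only needs the public key
print(decrypt(ciphertext, private_key))        # 65
```

The public key can be shared openly because computing `d` requires knowing the primes behind `n`, which is practical here only because the numbers are so small.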
  • Symmetric encryption: a cryptographic technique that relies on a single shared secret key to both encrypt and decrypt data
    • The key is kept confidential and must be securely shared between the sender and recipient of the encrypted information. When data is encrypted using this key, it is transformed into ciphertext, which can only be reverted to its original plaintext form using the same key.
  • Symmetric encryption algorithms, such as the Advanced Encryption Standard (AES), are known for their speed and efficiency, making them suitable for a wide range of applications, including secure communication, file encryption, and data storage
    • However, one of the primary challenges with symmetric encryption is securely distributing and managing the encryption key, as any compromise of the key can lead to the compromise of the encrypted data
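The defining property of symmetric encryption, one shared key for both directions, can be shown with a toy XOR cipher in Python. This is not AES and not secure; it is a stand-in chosen only because it needs nothing beyond the standard library. The function name `xor_cipher` and the sample key are invented for this example.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR each byte of `data` with the repeating `key`.

    Applying the function twice with the same key restores the original data,
    so the same call serves as both encryption and decryption.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

shared_key = b"secret"                       # must be known to both parties
plaintext = b"meet at noon"
ciphertext = xor_cipher(plaintext, shared_key)
recovered = xor_cipher(ciphertext, shared_key)

print(ciphertext != plaintext)  # True
print(recovered)                # b'meet at noon'
```

Anyone who learns `shared_key` can run the same function and read the message, which is the key-distribution problem described above.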
  • Crowdsourcing: the collection of information, opinions, or work from a group of people, usually sourced via the Internet
    • Crowdsourcing work allows companies to save time and money while drawing on people with different skills and perspectives from all over the world
    • The Internet serves as a powerful enabler for crowdsourcing by providing the infrastructure, communication channels, data accessibility, and collaboration tools needed to engage a diverse and global crowd in various projects and problem-solving endeavors