Sunday, January 19, 2025

Security Certificates

 

1. Cryptography Basics

  • Understand Key Concepts:
    • Encryption, decryption, hashing, and digital signatures.
    • Key terms: confidentiality, integrity, authentication, non-repudiation.
  • Learn Symmetric Cryptography:
    • Algorithms: AES, DES (legacy, avoid for new work), ChaCha20.
    • Modes of operation: ECB, CBC, GCM.
  • Learn Asymmetric Cryptography:
    • Algorithms: RSA, ECC, DSA.
    • Key exchange (Diffie-Hellman).
  • Practical Steps:
    • Use OpenSSL for simple encryption and decryption (see the sketch below).
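
  A minimal sketch of password-based symmetric encryption with OpenSSL; file names are placeholders:

    openssl enc -aes-256-cbc -pbkdf2 -salt -in secret.txt -out secret.enc   # encrypt (prompts for a password)
    openssl enc -d -aes-256-cbc -pbkdf2 -in secret.enc -out decrypted.txt   # decrypt with the same password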

2. Keys: Private vs. Public

  • Private Key:
    • Always secret, used for decryption and signing.
    • Stored securely (e.g., in HSM or keystore).
  • Public Key:
    • Shared openly, used for encryption and signature verification.
  • Master Public-Private Key Pair Usage:
    • Encrypt a message using the public key and decrypt with the private key (see the sketch below).
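
  A minimal OpenSSL sketch of the key-pair round trip; key and message file names are placeholders:

    openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out private.pem   # generate a private key
    openssl pkey -in private.pem -pubout -out public.pem                            # derive the public key
    openssl pkeyutl -encrypt -pubin -inkey public.pem -in msg.txt -out msg.enc      # encrypt with the public key (message must be smaller than the key size)
    openssl pkeyutl -decrypt -inkey private.pem -in msg.enc -out msg.dec            # decrypt with the private key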

3. X.509 Certificates

  • What is X.509?
    • A standard format for public key certificates.
    • Includes details like subject, issuer, validity period, and public key.
  • Practical Work:
    • Generate certificates using OpenSSL.
    • Create a Certificate Authority (CA) and sign requests (see the sketch below).
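
  A minimal sketch of a test CA with OpenSSL; subject names and file names are placeholders, and this is not a production setup:

    # create a self-signed CA certificate
    openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt -days 365 -subj "/CN=MyTestCA"
    # create a server key and certificate signing request (CSR)
    openssl req -newkey rsa:2048 -nodes -keyout server.key -out server.csr -subj "/CN=example.com"
    # sign the CSR with the CA
    openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server.crt -days 365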

4. PKCS Standards

  • Key Standards:
    • PKCS#11: Cryptographic token interface (e.g., HSM integration).
    • PKCS#12: Personal Information Exchange format for private keys and certificates.
    • PKCS#7: Cryptographic message syntax for signed/encrypted data.
  • Practical Work:
    • Use pkcs11-tool and OpenSSL to explore PKCS implementations (a PKCS#12 example follows below).
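
  A minimal PKCS#12 sketch with OpenSSL (file names are placeholders): bundle a private key and certificate chain into one .p12 file, then inspect it:

    openssl pkcs12 -export -inkey server.key -in server.crt -certfile ca.crt -out bundle.p12   # prompts for an export password
    openssl pkcs12 -in bundle.p12 -info -noout                                                 # show bundle details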

5. PEM vs. DER

  • PEM (Privacy-Enhanced Mail):
    • Base64-encoded certificate format, wrapped in -----BEGIN/END----- markers.
    • Common extensions: .pem, .crt, .cer.
  • DER (Distinguished Encoding Rules):
    • Binary certificate format.
    • Common extension: .der (note that .crt and .cer files may hold either encoding).
  • Practical Comparison:
    • Convert between PEM and DER using OpenSSL (reverse conversion and inspection shown below):
      openssl x509 -in cert.pem -outform der -out cert.der
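      openssl x509 -in cert.der -inform der -out cert.pem -outform pem   # DER back to PEM
      openssl x509 -in cert.pem -text -noout                             # inspect certificate fields (issuer, validity, key)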

6. Keystore and Truststore

  • Keystore:
    • Stores private keys and associated certificates.
    • Used by applications to establish their identity.
  • Truststore:
    • Stores trusted certificates (usually CA certificates).
    • Verifies the identity of external systems.
  • Tools:
    • Use Java keytool to manage keystores and truststores (see the sketch below).
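
  A minimal keytool sketch; alias and file names are placeholders:

    keytool -genkeypair -alias myapp -keyalg RSA -keysize 2048 -keystore keystore.p12 -storetype PKCS12   # create a keystore with a key pair
    keytool -list -v -keystore keystore.p12                                                               # inspect keystore contents
    keytool -importcert -alias myca -file ca.crt -keystore truststore.p12 -storetype PKCS12               # add a CA cert to a truststore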

  • Certificate (X.509 Certificate):

    • A digital certificate (e.g., X.509) is a document that binds a public key to an entity (such as a person, organization, or website).
    • Contains:
      • The public key of the entity.
      • Information about the entity (name, organization, etc.).
      • Issuer details (Certificate Authority - CA).
      • Validity period (start and expiry dates).
      • Signature of the CA to ensure authenticity.
    • Does not contain: The private key.

    Example:
    A website's SSL/TLS certificate contains the site's public key but not the private key.

  • Key (Public/Private Key):

    • A key pair consists of:
      • Public key: Used for encryption or signature verification.
      • Private key: Used for decryption or signing (must be kept secret).
    • A key itself does not contain a certificate, but it can be associated with a certificate.
  • How They Work Together:

    • When a certificate is issued, it includes the public key that corresponds to a private key stored securely by the certificate owner.
    • When encrypting or verifying data, the certificate's public key is used.
    • The corresponding private key (not in the certificate) is used to decrypt or sign.
    Sunday, December 29, 2024

    Kubernetes

    Kubernetes Architecture Diagram


    Kubernetes architecture is designed to manage and orchestrate containerized applications efficiently. It consists of a Master Node (control plane) and multiple Worker Nodes, along with various components that communicate and collaborate to ensure application reliability, scalability, and high availability.




     

    Kubernetes Architecture Overview

    1. Master Node (Control Plane)

    The master node is responsible for managing the cluster, maintaining the desired state of applications, and scheduling workloads.

    Key Components of the Master Node

    1. API Server:

      • Acts as the cluster's front-end.
      • Receives REST API requests from users, tools, and other components.
      • Validates and processes the requests, and updates the cluster's state in etcd.
    2. Scheduler:

      • Assigns work (Pods) to worker nodes based on resource availability, constraints, and policies.
      • Ensures efficient utilization of cluster resources.
    3. Controller Manager:

      • Runs various controllers (control loops) to ensure the desired state of the cluster.
      • Types of controllers:
        • Node Controller: Monitors node health.
        • Replication Controller: Ensures the desired number of pod replicas.
        • Endpoint Controller: Manages service and pod relationships.
        • Service Account Controller: Manages service accounts and API tokens.
    4. etcd:

      • A distributed key-value store used for storing cluster state and configuration data.
      • Provides consistency and high availability for cluster metadata.

    2. Worker Node

    Worker nodes run application workloads and manage containers. Each worker node is responsible for running Pods.

    Key Components of a Worker Node

    1. Kubelet:

      • An agent that runs on each worker node.
      • Ensures containers are running in the desired state as specified by the control plane.
      • Communicates with the API Server.
    2. Container Runtime:

      • Software responsible for running containers (e.g., Docker, containerd, CRI-O).
      • Interfaces with Kubernetes using the Container Runtime Interface (CRI).
    3. Kube Proxy:

      • Manages networking for Pods.
      • Implements network rules to allow communication between Pods and Services.
      • Supports various networking modes (e.g., iptables or IPVS).
    4. Pod:

      • The smallest deployable unit in Kubernetes, which encapsulates one or more containers, storage resources, and networking.
      • All Pods on a worker node are scheduled by the Master Node.

    3. Add-Ons

    Additional components that extend Kubernetes functionality:

    • Dashboard: A web UI for managing and monitoring the cluster.
    • DNS: Internal DNS for resolving service names within the cluster.
    • Ingress Controller: Manages HTTP and HTTPS traffic to applications.
    • Monitoring Tools: Tools like Prometheus or Grafana for monitoring.
    • Logging: Centralized logging solutions like Fluentd or Elasticsearch.

    Key Concepts

    1. Pod

    • The smallest deployable unit in Kubernetes.
    • Encapsulates containers, shared storage, and network.

    2. Service

    • Exposes a group of Pods as a network service.
    • Types:
      • ClusterIP (default): Internal access within the cluster.
      • NodePort: External access via a node's IP and a static port.
      • LoadBalancer: External access via a cloud provider's load balancer.

    3. Ingress

    • Manages external access to services, usually HTTP/HTTPS.

    4. ReplicaSet

    • Ensures a specified number of pod replicas are running at all times.

    5. Deployment

    • Manages updates, rollbacks, and scaling of applications.

    6. Namespace

    • Logical partitioning for resource isolation and organization.

    Benefits of Kubernetes Architecture

    1. High Availability: Redundancy at multiple levels ensures no single point of failure.
    2. Scalability: Automatically scales applications based on demand (see the autoscaling sketch after this list).
    3. Flexibility: Supports hybrid and multi-cloud deployments.
    4. Self-Healing: Automatically restarts or replaces failed containers.
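
    As a sketch of the scalability point (item 2 above), a Horizontal Pod Autoscaler can be created imperatively; the deployment name and thresholds are placeholders:

    kubectl autoscale deployment my-deployment --cpu-percent=80 --min=2 --max=10   # keep 2-10 replicas based on CPU load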

    Cluster Information

    kubectl cluster-info            # View cluster info
    kubectl get nodes               # List nodes in the cluster
    kubectl get componentstatuses   # Check cluster components' health

    Namespaces

    kubectl get namespaces                                     # List all namespaces
    kubectl create namespace <name>                            # Create a namespace
    kubectl delete namespace <name>                            # Delete a namespace
    kubectl config set-context --current --namespace=<name>    # Set default namespace

    Pods

    kubectl get pods                                          # List all pods in the current namespace
    kubectl get pods -n <namespace>                           # List pods in a specific namespace
    kubectl describe pod <pod-name>                           # Detailed information about a pod
    kubectl logs <pod-name>                                   # View logs of a pod
    kubectl exec -it <pod-name> -- /bin/bash                  # Access a pod's shell
    kubectl delete pod <pod-name>                             # Delete a pod
    kubectl get pods --field-selector=status.phase=Running    # List running pods

    Deployments

    kubectl get deployments                               # List deployments
    kubectl describe deployment <name>                    # Detailed deployment information
    kubectl apply -f deployment.yaml                      # Apply a deployment configuration
    kubectl scale deployment <name> --replicas=<count>    # Scale a deployment
    kubectl rollout restart deployment/<name>             # Restart deployment pods
    kubectl edit deployment <name>                        # Edit a deployment interactively
    kubectl delete deployment <name>                      # Delete a deployment

    Services

    kubectl get services                                            # List all services
    kubectl describe service <name>                                 # Detailed service information
    kubectl expose deployment <name> --type=NodePort --port=8080    # Expose a deployment
    kubectl apply -f service.yaml                                   # Apply a service configuration
    kubectl delete service <name>                                   # Delete a service

    ConfigMaps & Secrets

    kubectl get configmaps                                           # List all ConfigMaps
    kubectl create configmap <name> --from-literal=key=value         # Create a ConfigMap
    kubectl describe configmap <name>                                # Detailed ConfigMap information
    kubectl delete configmap <name>                                  # Delete a ConfigMap
    kubectl get secrets                                              # List all Secrets
    kubectl create secret generic <name> --from-literal=key=value    # Create a Secret
    kubectl describe secret <name>                                   # Detailed Secret information
    kubectl delete secret <name>                                     # Delete a Secret

    Ingress

    kubectl get ingress              # List all Ingress rules
    kubectl apply -f ingress.yaml    # Apply an Ingress configuration
    kubectl delete ingress <name>    # Delete an Ingress

    YAML File Templates

    Pod YAML

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-pod
      labels:
        app: my-app
    spec:
      containers:
        - name: my-container
          image: nginx
          ports:
            - containerPort: 80

    Deployment YAML

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-deployment
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
            - name: my-container
              image: nginx
              ports:
                - containerPort: 80

    Service YAML

    apiVersion: v1
    kind: Service
    metadata:
      name: my-service
    spec:
      selector:
        app: my-app
      ports:
        - protocol: TCP
          port: 80
          targetPort: 80
      type: NodePort

    Ingress YAML

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: my-ingress
    spec:
      rules:
        - host: example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: my-service
                    port:
                      number: 80

    Advanced Commands

    Config & Resources

    kubectl get all                  # Get all resources in the namespace
    kubectl top nodes                # Show resource usage of nodes
    kubectl top pods                 # Show resource usage of pods
    kubectl apply -f <file>          # Apply configuration from a file
    kubectl edit <resource> <name>   # Edit a resource interactively
    kubectl delete -f <file>         # Delete resources defined in a file

    Debugging

    kubectl describe <resource> <name>          # Detailed info about a resource
    kubectl logs <pod-name>                     # Fetch logs from a pod
    kubectl logs <pod-name> -c <container>      # Logs for a specific container
    kubectl exec -it <pod-name> -- <command>    # Run commands inside a container
    kubectl get events                          # View cluster events

    Rollouts

    kubectl rollout status deployment/<name> # Check rollout status
    kubectl rollout undo deployment/<name> # Rollback to a previous version




    Wednesday, November 27, 2024

    AWS

     

    • What are IAM roles and policies?
    • Lambda pros/cons (cold-start latency).
    • How many ways can we invoke a Lambda?
    • SQS vs. SNS.
    • S3 usage and the different types of S3 data encryption.
    • Can we create the same bucket name in different regions?
    • How to integrate Spring Boot with Lambda.
    • What is ECS, and how to deploy microservices: configuration steps for each (cluster creation, service creation, task definition, domains, VPC networking).
    • How to configure VPC inbound and outbound rules for ECS.
    • What is EC2, and how to deploy a Spring Boot microservice on it (pros and cons)?
    • Difference between Fargate and EC2.
    • What is Fargate?
    • AWS RDS and ECR configuration.
    • Use cases for EKS and how it differs from plain Kubernetes.
    • Difference between DynamoDB and MongoDB (with index creation).
    • Different types of AWS gateways and load balancers.
    • Subnet and VPC configuration for IP restrictions from one server to another.
    • Difference between CloudWatch and CloudTrail.
    • How to integrate Spring Boot with CloudWatch.
    • What is Elastic Beanstalk?
    • AWS CLI commands for ECS and EC2 (see the sketch below).
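
    A few illustrative AWS CLI commands for the last item; cluster and service names are placeholders:

    aws ecs list-clusters                                                            # list ECS clusters
    aws ecs describe-services --cluster my-cluster --services my-service             # inspect an ECS service
    aws ec2 describe-instances --filters "Name=instance-state-name,Values=running"   # list running EC2 instances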

    Saturday, November 23, 2024

    ReactJS

     

    Getting Started

    • Install React:

      npx create-react-app my-app
      cd my-app
      npm start
    • Basic Component:

      import React from 'react';
      const MyComponent = () => {
        return <h1>Hello, World!</h1>;
      };

      export default MyComponent;

    JSX (JavaScript XML)

    • Embed expressions in curly braces:

      const name = "React"; <h1>Hello, {name}!</h1>
    • Conditional rendering:

      const isLoggedIn = true;
      <div>{isLoggedIn ? 'Welcome!' : 'Please log in.'}</div>;
    • Apply CSS:

      <div style={{ color: 'blue', fontSize: '20px' }}>Styled Text</div>

    Props

    • Passing data to child components:

      const Welcome = ({ name }) => <h1>Hello, {name}!</h1>;
      <Welcome name="John" />;
    • Default props:

      Welcome.defaultProps = { name: 'Guest' };

    State (useState Hook)

    • Manage component state:
      import React, { useState } from 'react';
      const Counter = () => {
        const [count, setCount] = useState(0);
        return (
          <div>
            <p>Count: {count}</p>
            <button onClick={() => setCount(count + 1)}>Increment</button>
          </div>
        );
      };

    Effect (useEffect Hook)

    • Handle side effects (e.g., fetching data):
      import React, { useEffect, useState } from 'react';

      const FetchData = () => {
        const [data, setData] = useState([]);
        useEffect(() => {
          fetch('https://api.example.com/data')
            .then((response) => response.json())
            .then((data) => setData(data));
        }, []); // Empty array: runs only on mount
        return <div>{JSON.stringify(data)}</div>;
      };

    Events

    • Handling events:

      const handleClick = () => alert('Button clicked!');
      <button onClick={handleClick}>Click Me</button>;
    • Passing parameters:

      <button onClick={() => handleClick('React')}>Click Me</button>;

    Forms and Inputs

    • Controlled inputs:
      const [value, setValue] = useState('');
      const handleChange = (event) => setValue(event.target.value);
      return <input type="text" value={value} onChange={handleChange} />;

    Lists

    • Rendering lists:
      const items = ['React', 'Angular', 'Vue'];
      return (
        <ul>
          {items.map((item, index) => (
            <li key={index}>{item}</li>
          ))}
        </ul>
      );

    Routing (React Router)

    • Install React Router:

      npm install react-router-dom
    • Basic example:

      import { BrowserRouter as Router, Route, Link, Switch } from 'react-router-dom';

      const App = () => (
        <Router>
          <nav>
            <Link to="/">Home</Link> <Link to="/about">About</Link>
          </nav>
          <Switch>
            <Route path="/" exact component={() => <h1>Home</h1>} />
            <Route path="/about" component={() => <h1>About</h1>} />
          </Switch>
        </Router>
      );

      (Note: this is the React Router v5 API; v6 replaces Switch with Routes.)

    Lifecycle Methods

    • Using Hooks for lifecycle:
      • Mounting: useEffect(() => {}, []);
      • Updating: useEffect(() => {}, [dependencies]);
      • Unmounting:

        useEffect(() => {
          return () => {
            // Cleanup code here
          };
        }, []);
    • Mounting:
      • constructor()
      • getDerivedStateFromProps()
      • render()
      • componentDidMount()
    • Updating:
      • getDerivedStateFromProps()
      • shouldComponentUpdate()
      • render()
      • getSnapshotBeforeUpdate()
      • componentDidUpdate()
    • Unmounting:
      • componentWillUnmount()

    Context API

    • Context for global state:
      import React, { createContext, useContext } from 'react';
      const ThemeContext = createContext('light');

      const App = () => (
        <ThemeContext.Provider value="dark">
          <Toolbar />
        </ThemeContext.Provider>
      );

      const Toolbar = () => {
        const theme = useContext(ThemeContext);
        return <div>Theme: {theme}</div>;
      };

    Custom Hooks

    • Create reusable logic:
      const useCounter = (initialValue = 0) => {
        const [count, setCount] = useState(initialValue);
        const increment = () => setCount(count + 1);
        return [count, increment];
      };

      const Counter = () => {
        const [count, increment] = useCounter();
        return <button onClick={increment}>Count: {count}</button>;
      };

    Optimizations

    • React.memo: Prevent unnecessary renders.

      const MemoizedComponent = React.memo(MyComponent);
    • useCallback: Memoize functions.

      const memoizedCallback = useCallback(() => { doSomething(); }, [dependencies]);
    • useMemo: Memoize values.

      const memoizedValue = useMemo(() => computeExpensiveValue(a, b), [a, b]);



    Full Expansions
    imr - Import React
    import * as React from "react";

    imrc - Import React, Component
    import * as React from "react";
    import { Component } from "react";

    imrd - Import ReactDOM
    import ReactDOM from "react-dom";

    imrs - Import React, useState
    import * as React from "react";
    import { useState } from "react";

    imrse - Import React, useState, useEffect
    import * as React from "react";
    import { useState, useEffect } from "react";

    impt - Import PropTypes
    import PropTypes from "prop-types";

    impc - Import PureComponent
    import * as React from "react";
    import { PureComponent } from "react";

    cc - Class Component
    class | extends React.Component {
      render() {
        return <div>|</div>
      }
    }

    export default |;

    ccc - Class Component With Constructor
    class | extends Component {
      constructor(props) {
        super(props);
        this.state = { | };
      }
      render() {
        return ( | );
      }
    }

    export default |;

    cpc - Class Pure Component
    class | extends PureComponent {
      state = { | },
      render() {
        return ( | );
      }
    }

    export default |;

    ffc - Function Component
    function (|) {
        return ( | );
    }

    export default |;

    sfc - Stateless Function Component (Arrow function)
    const | = props => {
      return ( | );
    };

    export default |;

    cdm - componentDidMount
    componentDidMount() {
      |
    }

    uef - useEffect Hook
    useEffect(() => {
      |
    }, []);

    ucb - useCallback Hook
    useCallback((val) => {
      |
    }, []);

    cwm - componentWillMount
    //WARNING! To be deprecated in React v17. Use componentDidMount instead.
    componentWillMount() {
      |
    }

    cwrp - componentWillReceiveProps
    //WARNING! To be deprecated in React v17. Use new lifecycle static getDerivedStateFromProps instead.
    componentWillReceiveProps(nextProps) {
      |
    }

    gds - getDerivedStateFromProps
    static getDerivedStateFromProps(nextProps, prevState) {
      |
    }

    scu - shouldComponentUpdate
    shouldComponentUpdate(nextProps, nextState) {
      |
    }

    cwu - componentWillUpdate
    //WARNING! To be deprecated in React v17. Use componentDidUpdate instead.
    componentWillUpdate(nextProps, nextState) {
      |
    }

    cdu - componentDidUpdate
    componentDidUpdate(prevProps, prevState) {
      |
    }

    cwun - componentWillUnmount
    componentWillUnmount() {
      |
    }

    cdc - componentDidCatch
    componentDidCatch(error, info) {
      |
    }

    gsbu - getSnapshotBeforeUpdate
    getSnapshotBeforeUpdate(prevProps, prevState) {
      |
    }

    ss - setState
    this.setState({ | : | });

    ssf - Functional setState
    this.setState(prevState => {
      return { | : prevState.| }
    });

    usf - Declare a new state variable using State Hook
    const [|, set|] = useState();

    Hit Tab to apply CamelCase to function. e.g. [count, setCount]

    ren - render
    render() {
      return (
        |
      );
    }

    rprop - Render Prop
    class | extends Component {
      state = { | },
      render() {
        return this.props.render({
          |: this.state.|
        });
      }
    }

    export default |;

    hoc - Higher Order Component
    function | (|) {
      return class extends Component {
        constructor(props) {
          super(props);
        }

        render() {
          return < | {...this.props} />;
        }
      };
    }

    cpf - Class Property Function
      | = (e) => {
        |
      }

    Tuesday, November 19, 2024

    Kafka

    Kafka architecture is designed as a distributed, scalable, and fault-tolerant system for real-time data streaming. It is widely used for building streaming applications and data pipelines.


    Producer Mechanics

    • Batching and Compression:
      • Producers batch messages to optimize network utilization.
      • Compression (e.g., gzip, Snappy, LZ4, Zstd) reduces message size and improves throughput.
    • Acknowledgment Modes:
      • acks=0: Fire-and-forget, no guarantees.
      • acks=1: Leader acknowledgment; higher throughput, possible data loss.
      • acks=all: Full acknowledgment; durable but slower (see the sketch after this list).
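
    A quick way to exercise these settings is the console producer shipped in Kafka's bin/ directory; broker address and topic name are placeholders, and --bootstrap-server assumes a reasonably recent Kafka:

    kafka-console-producer.sh --bootstrap-server localhost:9092 --topic demo \
      --producer-property acks=all --producer-property compression.type=gzip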

    Consumer Mechanics

    • Offset Management:
      • Consumers track message offsets via Kafka’s internal topic (__consumer_offsets) or externally (e.g., database).
    • Rebalancing:
      • Dynamic assignment of partitions to consumers in a group.
      • Sticky partitioning strategies reduce data re-fetching during rebalances.
    • Commit Strategies:
      • Auto-commit: Automatic offset saving; fast but risk of duplicate processing.
      • Manual commit: Offers control but requires careful management (see the offset-inspection sketch after this list).
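
    A sketch for inspecting a consumer group's offsets and lag; broker, topic, and group names are placeholders:

    kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic demo --group my-group --from-beginning
    kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-group   # current offset, log-end offset, and lag per partition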

    Kafka Storage Insights

    • Retention Policies:
      • Time-based: Retain data for a configured duration.
      • Size-based: Retain data until partition log reaches a set size.
    • Log Compaction:
      • Retains the latest record for a key, useful for change data capture (CDC) or state storage (see the topic-config sketch after this list).
    • Tiered Storage (Newer feature):
      • Offloads cold data to cheaper, external storage like S3 or HDFS.
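
    A sketch of setting these policies per topic; names and values are placeholders:

    kafka-topics.sh --bootstrap-server localhost:9092 --create --topic user-state --partitions 3 --replication-factor 1 --config cleanup.policy=compact
    kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type topics --entity-name demo --add-config retention.ms=604800000   # 7 days, time-based retention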

    Kafka Clustering

    • Broker Management:
      • Horizontal scaling by adding brokers.
      • Partition reassignment tools manage data redistribution.

    Monitoring and Optimization

    • Key Metrics:
      • Broker Metrics: Disk I/O, network throughput, replication lag.
      • Producer Metrics: Record send rate, compression ratio, batch size.
      • Consumer Metrics: Fetch lag, commit latency, offset lag.

    Kafka at Scale

    • Capacity Planning:
      • Estimate throughput, storage, and partitioning needs.
    • Scaling Strategies:
      • Dynamic addition of brokers, partition scaling, and topic rebalance.
    • High Throughput:
      • Optimize producer and broker configurations for sustained performance.

    Error handling strategies

    Several options are available for handling messages stored in a dead letter queue:

    • Re-process: Some messages in the DLQ need to be re-processed. However, first, the issue needs to be fixed. The solution can be an automatic script, human interaction to edit the message, or returning an error to the producer asking for re-sending the (corrected) message.
    • Drop the bad messages (after further analysis): Bad messages might be expected depending on your setup. However, before dropping them, a business process should examine them. For instance, a dashboard app can consume the error messages and visualize them.
    • Advanced analytics: Instead of processing each message in the DLQ, another option is to analyze the incoming data for real-time insights or issues. For instance, a simple ksqlDB application can apply stream processing for calculations, such as the average number of error messages per hour or any other insights that help decide on the errors in your Kafka applications.
    • Stop the workflow: If bad messages are rarely expected, the consequence might be stopping the overall business process. The action can either be automated or decided by a human. Of course, stopping the workflow could also be done in the Kafka application that throws the error. The DLQ externalizes the problem and decision-making if needed.
    • Ignore: This might sound like the worst option. Just let the dead letter queue fill up and do nothing. However, even this is fine in some use cases, like monitoring the overall behavior of the Kafka application. Keep in mind that a Kafka topic has a retention time, and messages are removed from the topic after that time. Just set this up the right way for your use case, and monitor the DLQ topic for unexpected behavior (like filling up way too quickly).


    Kafka Questions

    1. Kafka Basics

    1. What is Kafka, and how does it differ from traditional messaging systems like RabbitMQ or ActiveMQ?
    2. Explain Kafka’s architecture and its core components.
    3. How does Kafka ensure fault tolerance?
    4. What is the difference between a topic, partition, and offset in Kafka?
    5. Can Kafka be used as a database? Why or why not?

    2. Kafka Producers

    1. How do Kafka producers achieve high throughput?
    2. What are the acknowledgment (acks) configurations in Kafka, and how do they affect message delivery guarantees?
    3. Explain Kafka producer's batching mechanism.
    4. How does Kafka handle retries and retries with idempotence?
    5. What is the role of the partition key in a Kafka producer? How does it influence message routing?

    3. Kafka Consumers

    1. What is the purpose of consumer groups in Kafka?
    2. Explain how Kafka ensures message delivery semantics: at-least-once, at-most-once, and exactly-once.
    3. How are offsets managed in Kafka? What are the pros and cons of auto-committing offsets?
    4. What is rebalancing in Kafka, and how can it affect consumers?
    5. How would you troubleshoot offset lag in a consumer group?

    4. Kafka Brokers and Clustering

    1. How does Kafka distribute partitions among brokers?
    2. What is ISR (In-Sync Replica) in Kafka, and why is it important?
    3. How does Kafka handle leader election for partitions?
    4. Explain the difference between Kafka’s old ZooKeeper-based architecture and the new KRaft architecture.
    5. What happens when a Kafka broker fails? How is data consistency ensured?

    5. Kafka Storage

    1. How does Kafka handle log segmentation and log compaction?
    2. What are Kafka’s retention policies, and when would you use each type?
    3. How does Kafka achieve high performance with its write-ahead log (WAL) design?
    4. Explain tiered storage in Kafka and its advantages.
    5. What is the role of indexes in Kafka logs, and how do they optimize reads?

    6. Kafka Security

    1. What security features does Kafka provide?
    2. How do SASL and SSL/TLS work in Kafka for authentication and encryption?
    3. What is the purpose of Kafka ACLs, and how do you configure them?
    4. Explain the concept of role-based access control (RBAC) in Kafka.
    5. How would you secure a Kafka cluster in a production environment?

    7. Kafka Operations

    1. How do you monitor the health of a Kafka cluster?
    2. What are some common Kafka metrics, and why are they important?
    3. How would you handle partition reassignment in Kafka?
    4. What are the best practices for scaling a Kafka cluster?
    5. How do you troubleshoot issues like high replication lag or message delays?

    8. Kafka Streams and Kafka Connect

    1. What is Kafka Streams, and how does it differ from Apache Spark Streaming or Flink?
    2. How do stateful operations in Kafka Streams work, and where is the state stored?
    3. Explain the difference between KTable and KStream.
    4. What is Kafka Connect, and how does it help in integrating systems with Kafka?
    5. How would you handle schema evolution in Kafka Connect with tools like Schema Registry?

    9. Advanced Kafka Topics

    1. How does Kafka achieve exactly-once semantics (EOS)?
    2. What are the advantages and limitations of using Kafka for event sourcing?
    3. Explain the concept of a dead letter queue (DLQ) in Kafka.
    4. How does Kafka MirrorMaker 2.0 work for cross-cluster replication?
    5. What strategies would you use to optimize Kafka throughput?

    10. Kafka at Scale

    1. What factors influence Kafka’s partitioning strategy, and how do you determine the number of partitions?
    2. How would you design a Kafka deployment to handle high-throughput workloads?
    3. Explain Kafka’s performance trade-offs when handling large messages.
    4. How would you handle multi-region Kafka deployments?
    5. What are the key considerations for capacity planning in a Kafka cluster?

    11. Real-World Scenarios

    1. How would you design a fault-tolerant Kafka pipeline for a payment system?
    2. What challenges have you faced in Kafka production environments, and how did you resolve them?
    3. Describe a use case where you used Kafka Streams to process real-time data.
    4. How do you handle schema compatibility in Kafka when integrating with multiple systems?
    5. Have you implemented a Kafka monitoring or alerting system? What tools did you use?


    1. What are Kafka’s main components and their roles?

    Answer:

    • Producer: Sends messages to Kafka topics.
    • Consumer: Reads messages from Kafka topics.
    • Broker: Kafka server that stores messages on disk and serves client requests.
    • Topic: A logical channel to which messages are published and read.
    • Partition: A topic is divided into partitions for scalability and parallelism.
    • Offset: A unique identifier for each message within a partition.
    • ZooKeeper/KRaft: (Legacy/New) Responsible for metadata management, leader election, and state coordination.

    2. How does Kafka achieve fault tolerance?

    Answer:

    Kafka achieves fault tolerance through:

    • Replication: Each partition has replicas across brokers. If the leader fails, another replica is promoted.
    • In-Sync Replica (ISR): Replicas synchronized with the leader; ensures consistency.
    • Data Persistence: Messages are stored on disk and survive broker failures.
    • Leader Election: Handles broker or partition leader failure using ZooKeeper or KRaft.

    3. What is the role of a partition in Kafka?

    Answer:

    • Partitions allow Kafka to scale horizontally by distributing data across brokers.
    • Each partition is processed independently, enabling parallelism.
    • Partitions maintain message order within themselves but not across the entire topic.
    • They also play a critical role in replication for fault tolerance.

    4. How does Kafka handle message delivery guarantees?

    Answer:

    • At-least-once: Default; messages are delivered at least once, possible duplicates.
      • Achieved by re-sending if acknowledgments fail.
    • At-most-once: Messages are delivered at most once, possible data loss.
      • Achieved by disabling retries and acknowledgment.
    • Exactly-once: Ensures no duplicates or losses.
      • Achieved using idempotent producers and Kafka transactions.

    5. What is Kafka’s log compaction?

    Answer:

    Log compaction is a mechanism to retain only the latest message for a key in a topic, ensuring:

    • Storage optimization by discarding old values.
    • Supporting use cases like change data capture (CDC) or maintaining up-to-date key-value states.
    • It is controlled by the cleanup.policy=compact configuration.

    6. Explain ZooKeeper’s role in Kafka.

    Answer:

    In Kafka (legacy):

    • Manages metadata like brokers, topics, and partitions.
    • Handles leader election for partitions.
    • Tracks broker heartbeats to detect failures.

    In newer Kafka versions (KRaft):

    • ZooKeeper is replaced by Kafka-native consensus for managing metadata.

    7. How do Kafka consumers handle offset management?

    Answer:

    Consumers use offsets to track their progress:

    • Automatic Offset Commit: Kafka automatically commits offsets at regular intervals.
    • Manual Offset Commit: Consumers explicitly commit offsets for greater control.

    Offsets are stored:

    • In Kafka: Default; stored in the __consumer_offsets topic.
    • Externally: Custom storage mechanisms (e.g., databases) for advanced use cases.

    8. What are ISR (In-Sync Replicas) and their importance?

    Answer:

    ISR is the set of replicas that are fully synchronized with the partition leader.

    • Importance:
      • Ensures data durability and fault tolerance.
      • Leader election only happens among ISR replicas to prevent data loss.
    • If a replica falls behind, it’s removed from ISR.

    9. What are the configurations for producer acknowledgment (acks)?

    Answer:

    • acks=0: Producer doesn’t wait for acknowledgment. Fast but risky (possible data loss).
    • acks=1: Leader acknowledges once it writes to the log. Balances reliability and performance.
    • acks=all: All ISR replicas acknowledge. Ensures durability at the cost of latency.

    10. How does Kafka achieve exactly-once semantics?

    Answer:

    Kafka achieves exactly-once semantics (EOS) through:

    • Idempotent Producers: Ensures the same message isn’t written twice to the log.
    • Transactions: Groups multiple producer and consumer operations into atomic units.
    • Kafka Streams: Automatically supports EOS for stream processing.

    11. What is rebalancing in Kafka?

    Answer:

    Rebalancing occurs when:

    • A new consumer joins a group.
    • A consumer leaves or fails.
    • Topics or partitions change.

    Impact:

    • Partitions are reassigned among consumers.
    • May cause temporary unavailability or duplicate processing.

    Optimization:

    • Use sticky partition assignment strategies to minimize disruptions.

    12. What are Kafka’s retention policies?

    Answer:

    • Time-Based: Retain messages for a configured duration (log.retention.hours).
    • Size-Based: Retain messages until log size reaches a threshold (log.retention.bytes).
    • Log Compaction: Retain the latest record for a key (cleanup.policy=compact).

    13. What is Kafka Streams?

    Answer:

    Kafka Streams is a Java library for building real-time stream processing applications.

    • Features:
      • Supports stateful and stateless transformations.
      • Scales horizontally across multiple instances.
      • Provides fault-tolerant state stores.
    • Example Use Case:
      • Real-time data transformation or aggregations (e.g., computing metrics from logs).

    14. How would you monitor a Kafka cluster?

    Answer:

    Use tools and metrics like:

    • JMX Metrics: Monitor broker, producer, and consumer performance.
    • Prometheus/Grafana: Visual dashboards for Kafka metrics.
    • Key Metrics:
      • Broker: Disk usage, network throughput.
      • Producer: Record send rate, retries.
      • Consumer: Offset lag, fetch latency.
    • Tools: Confluent Control Center, LinkedIn’s Burrow.

    15. How does Kafka handle high throughput?

    Answer:

    • Batching: Combines multiple messages into a single network request.
    • Compression: Reduces message size using gzip, Snappy, or LZ4.
    • Partitioning: Distributes load across brokers for parallel processing.
    • Efficient I/O: Uses sequential disk writes (write-ahead logs).
    • Optimized Configurations:
      • Increase num.partitions and tune producer batch sizes.

    Issues with Resolutions

     

    1.
    Situation / Task
    (Explain the situation or task so others understand the context):

    A partition queue was blocked (for emails / for any events) due to one bad event.
    Action (Give details about what you or another person did to handle the situation):
    Raised an alert and logged the exception.
    Added a feature flag for logging/catching the exception so the bad event no longer blocks the partition.
    Result (Describe what was achieved by the action and why it was effective):
    The partition queue released all blocked events, and the queue is no longer blocked.


    2.
    Situation / Task (Explain the situation or task so others understand the context):
    A few messages in the queue were going missing.
    Action (Give details about what you or another person did to handle the situation):
    Previously we had registered multiple event-specific listeners; we consolidated them into a single endpoint listener and, once events arrive, split them out to the appropriate handlers based on event type.
    Result (Describe what was achieved by the action and why it was effective):
    All messages in the queue are now processed.

     

    3.
    Situation / Task (Explain the situation or task so others understand the context):
    Idempotency issue (duplicate events).
    Action (Give details about what you or another person did to handle the situation):
    Maintained the event key in a cache and validated incoming events against it (implemented idempotency logic),
    filtering out the duplicates.
    Result (Describe what was achieved by the action and why it was effective):
    Records are now captured reliably, with duplicates discarded.


    4.
    Situation / Task (Explain the situation or task so others understand the context):
    Alignment issue in the UX (browser-specific).
    Action (Give details about what you or another person did to handle the situation):
    It looked like a one-day task, but the issue was only fixed after revamping the complete HTML div block.
    Result (Describe what was achieved by the action and why it was effective):
    It now renders correctly in all browsers.

    5.
    Situation / Task (Explain the situation or task so others understand the context):
    ETL performance issues; each job was taking longer than expected to execute.
    Action (Give details about what you or another person did to handle the situation):
    Result (Describe what was achieved by the action and why it was effective):

     


    6.
    Situation / Task (Explain the situation or task so others understand the context):
    While using the @Async, @Transactional, and @Cacheable annotations, they were not working as expected.

    Action (Give details about what you or another person did to handle the situation):

    Result (Describe what was achieved by the action and why it was effective):
    Those annotations are Spring-specific and proxy-based.
    They are not applied to private methods, or when the calling method is in the same class (self-invocation).
    This is because Spring wraps the bean in a proxy when instantiating it; the proxy intercepts external calls to add behavior before/after the method, and internal calls bypass the proxy.

    7. Situation / Task (Explain the situation or task so others understand the context):
    FIS

    Action (Give details about what you or another person did to handle the situation):

    Result (Describe what was achieved by the action and why it was effective):

    Sunday, August 25, 2024

    Authentication & Authorization

    Key words 

    Knowledge Factor

    Possession Factor

    Inherence Factor

    No individual factor is secure on its own, because any single factor can be compromised.

    That is why 2FA and MFA came into the picture.


    SSO

    Oauth

    JWT (JSON Web Token) ==> not authentication by itself; it's a token format used for authorization.

    Okta

    OpenID Connect

    SAML (Security Assertion Markup Language)

    Azure Active Directory

    Service Provider

    Identity Provider




    Eg: Access token --> like an employee ID card (authorization), with validity checks (lost or reported stolen, left the organization, expired).



    Session or Cookie Based Authorization

    In cookie-based auth, the cookie carries a SESSION_ID that maps to session state held on the server.


    The load balancer maintains session affinity using the sticky-session pattern (a scalability problem).


    There is also a problem here: if the shared session cache is corrupted, it becomes a single point of failure.




    With JWT, the user brings all the security information to each microservice; the token itself carries it.

    For a stateless server, the client passes the JWT on every request, typically in the Authorization header, for example:
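
    A minimal sketch (the URL is a placeholder):

    curl -H "Authorization: Bearer <jwt>" https://api.example.com/orders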






    OAuth:
    Service-to-service authorization. Roles:
    Resource Owner
    Resource Server
    Authorization Server
    Client








    Used across different microservices within the same system.

    OpenID Connect
    It's an identity/authentication specification built on top of OAuth 2.0.





