discord

Discord’s Powerful Cross-Platform Chat: Ready for Your Game

Discord has officially moved its Social SDK communication features out of closed beta, making integrated voice and text chat available to all game developers. By bringing these native Discord features directly into the game environment, the SDK aims to foster deeper player connections and increase session lengths through improved multiplayer interactions. The release lets studios use Discord’s infrastructure without forcing players to leave the game client.

### Expanding In-Game Communication

* Developers can now fully implement Discord-powered voice and text chat features within their titles.
* The SDK is designed to enhance the multiplayer experience with the high-quality, reliable communication tools Discord is known for.
* Initially introduced at GDC, these features are intended to maximize player engagement by making social interaction a core part of the gameplay loop.

### Frictionless Player Connectivity

* The SDK allows players to connect with friends and join multiplayer sessions even if they do not currently have a Discord account.
* By removing barriers to entry, the tools help players find new teammates and build communities more easily within the game.
* Integration focuses on creating "meaningful multiplayer interactions" that contribute to higher player retention and longer-term interest in the game.

For developers seeking to build a robust social layer into their games, the Discord Social SDK offers a proven communication stack that also works for players without a Discord account, giving community-building efforts a broader reach.

discord

Introducing the Community Server Cleanup Report for August 2025

Discord is addressing long-standing community management challenges by launching a series of updates designed to empower server moderators and game developers. Through the formation of a new dedicated engineering team, the platform aims to provide more granular control and resolve common pain points for high-growth community builders. These improvements represent the initial phase of a broader commitment to enhancing server health and administrative efficiency.

**Strategic Focus on Community Control**

* Discord has established a specialized team focused exclusively on providing server leaders with more power and reducing administrative friction.
* The development roadmap currently prioritizes a backlog of legacy requests from community managers and game developers.
* The initiative focuses on creating a more stable environment for "healthy and active" servers through improved backend support and feature sets.

**Empowering Developers and Moderators**

* New updates are being released in "waves," with this first installment focusing on the core tools needed to spin up and maintain large communities.
* The platform aims to reduce the need for external workarounds by integrating requested fixes directly into the Discord interface.
* Special attention is being given to game developer-led communities to ensure they have the specific tools required to manage official brand spaces.

Community administrators should watch for subsequent update waves as Discord continues to roll out features from its dedicated community-power backlog. These native tool improvements will likely reduce reliance on third-party moderation bots and manual administrative overhead.

discord

Discord for Business Vol. 2: Cannes-worthy ad product updates

Discord's debut at the Cannes Lions festival marks a strategic milestone in the expansion of its advertising business and brand partnership ecosystem. By championing an opt-in, community-first model, the platform aims to redefine how advertisers engage with a digital-native audience that prioritizes privacy and authentic interaction. The central conclusion from its industry showcase is that gaming has moved into the cultural mainstream, positioning Discord as a primary hub for audience influence and peer-to-peer communication.

### Strategic Advertising and Industry Partnerships

* Discord is actively scaling its ad business by moving away from traditional intrusive formats toward a model based on user consent and community integration.
* High-level panels featuring leadership from Xbox, Kantar, and Unilever underscore Discord's growing legitimacy as a critical platform for global brand strategy.
* The platform’s evolution focuses on providing businesses with measurable opportunities to connect with high-intent users within their own social environments.

### The Mainstreaming of Gaming Communities

* Gaming has transcended its niche origins to become a dominant cultural force, requiring brands to adapt to new methods of digital social interaction.
* Discord serves as the "digital third place" where modern gamers talk, share content, and influence one another’s purchasing decisions in real time.
* The community-first approach allows brands to move past broad-reach tactics and instead foster deeper connections within specific interest groups.

To reach modern gaming audiences effectively, brands should shift from traditional broadcast advertising toward community-centric engagement. Discord’s opt-in framework allows companies to build long-term loyalty by participating in the authentic conversations already happening within these influential digital spaces.

google

From massive models to mobile magic: The tech behind YouTube real-time generative AI effects

YouTube has successfully deployed over 20 real-time generative AI effects by distilling the capabilities of massive cloud-based models into compact, mobile-ready architectures. By utilizing a "teacher-student" training paradigm, the system overcomes the computational bottlenecks of high-fidelity generative AI while ensuring the output remains responsive on mobile hardware. This approach allows complex transformations, such as cartoon style transfer and makeup application, to run frame-by-frame on-device without sacrificing the user’s identity.

### Data Curation and Diversity

* The foundation of the effects pipeline relies on high-quality, properly licensed face datasets.
* Datasets are meticulously filtered to ensure a uniform distribution across different ages, genders, and skin tones.
* The Monk Skin Tone Scale is used as a benchmark to ensure the effects work equitably for all users.

### The Teacher-Student Framework

* **The Teacher:** A large, powerful pre-trained model (initially StyleGAN2 with StyleCLIP, later transitioning to Google DeepMind’s Imagen) acts as the "expert" that generates high-fidelity visual effects.
* **The Student:** A lightweight UNet-based architecture designed for mobile efficiency. It utilizes a MobileNet backbone for both the encoder and decoder to ensure fast frame-by-frame processing.
* The distillation process narrows the scope of the massive teacher model into a student model focused on a single, specific task.

### Iterative Distillation and Training

* **Data Generation:** The teacher model processes thousands of images to create "before and after" pairs. These are augmented with synthetic elements like AR glasses, sunglasses, and hand occlusions to improve real-world robustness.
* **Optimization:** The student model is trained using a combination of loss functions, including L1, LPIPS, Adaptive, and Adversarial loss, to balance numerical accuracy with aesthetic quality.
* **Architecture Search:** Neural architecture search is employed to tune "depth" and "width" multipliers, identifying the most efficient model structure for different mobile hardware constraints.

### Addressing the Inversion Problem

* A major challenge in real-time effects is the "inversion problem," where the model struggles to represent a real face in latent space, leading to a loss of the user's identity (e.g., changes in skin tone or clothing).
* YouTube uses Pivotal Tuning Inversion (PTI) to ensure that the user's specific features are preserved during the generative process.
* By editing images in the latent space (a compressed numerical representation), the system can apply stylistic changes while maintaining the core characteristics of the original video stream.

By combining advanced model distillation with on-device optimization via MediaPipe, YouTube demonstrates a practical path for bringing heavy generative AI research into consumer-facing mobile applications.
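The optimization step described above, where the student is trained against teacher outputs using a weighted mix of loss terms, can be illustrated with a minimal sketch. This is not YouTube's training code: the function names, the flat pixel lists, the toy "perceptual" stand-in, and the weights are all assumptions made for illustration. In practice LPIPS and adversarial terms require learned networks, so here they are represented only as pluggable callables.

```python
from typing import Callable, Dict, Sequence

Pixels = Sequence[float]
LossFn = Callable[[Pixels, Pixels], float]


def l1_loss(student: Pixels, teacher: Pixels) -> float:
    """Mean absolute difference between the student output and the teacher target."""
    if not student or len(student) != len(teacher):
        raise ValueError("outputs must be non-empty and the same length")
    return sum(abs(s - t) for s, t in zip(student, teacher)) / len(student)


def distillation_loss(
    student: Pixels,
    teacher: Pixels,
    extra_terms: Dict[str, LossFn],
    weights: Dict[str, float],
) -> float:
    """Weighted sum of L1 plus any extra terms (hypothetical names and weights)."""
    total = weights["l1"] * l1_loss(student, teacher)
    for name, loss_fn in extra_terms.items():
        # Each extra term stands in for a learned loss such as LPIPS or adversarial.
        total += weights[name] * loss_fn(student, teacher)
    return total


def toy_perceptual(student: Pixels, teacher: Pixels) -> float:
    """Toy stand-in for a perceptual term: worst-case per-pixel error, not real LPIPS."""
    return max(abs(s - t) for s, t in zip(student, teacher))


# One "before and after" pair: the teacher output is the training target.
student_out = [0.2, 0.5, 0.9]
teacher_out = [0.0, 0.5, 1.0]
loss = distillation_loss(
    student_out,
    teacher_out,
    extra_terms={"perceptual": toy_perceptual},
    weights={"l1": 1.0, "perceptual": 0.5},
)
```

Keeping each term behind a common callable signature makes it easy to rebalance numerical accuracy against aesthetic quality by changing only the weights.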

google

Securing private data at scale with differentially private partition selection

Google Research has introduced a novel parallel algorithm called MaxAdaptiveDegree (MAD) to enhance differentially private (DP) partition selection, a critical process for identifying common data items in massive datasets without compromising individual privacy. By utilizing an adaptive weighting mechanism, the algorithm optimizes the utility-privacy trade-off, allowing researchers to safely release significantly more data than previous non-adaptive methods. This enables privacy-preserving analysis on datasets containing hundreds of billions of items, up to three orders of magnitude larger than existing sequential approaches can handle.

## The Role of DP Partition Selection

* DP partition selection identifies a meaningful subset of unique items from large collections based on their frequency across multiple users.
* The process ensures that no single individual's data can be identified in the final list by adding controlled noise and filtering out items that are not sufficiently common.
* This technique is a foundational step for various machine learning tasks, including extracting n-gram vocabularies for language models, analyzing private data streams, and increasing efficiency in private model fine-tuning.

## The Weight, Noise, and Filter Paradigm

* The standard approach to private partition selection begins by computing a "weight" for each item, typically representing its frequency, while ensuring "low sensitivity" so no single user has an outsized impact.
* Random Gaussian noise is added to these weights to obfuscate exact counts, preventing attackers from inferring the presence of specific individuals.
* A threshold determined by DP parameters is then applied; only items whose noisy weights exceed this threshold are included in the final output.

## Improving Utility via Adaptive Weighting

* Traditional non-adaptive methods often result in "wastage," where highly popular items receive significantly more weight than necessary to cross the selection threshold.
* The MaxAdaptiveDegree (MAD) algorithm introduces adaptivity by identifying items with excess weight and rerouting that weight to "under-allocated" items sitting just below the threshold.
* This strategic reallocation allows a larger number of less-frequent items to be safely released, significantly increasing the utility of the dataset without compromising privacy or computational efficiency.

## Scalability and Parallelization

* Unlike sequential algorithms that process data one piece at a time, MAD is designed as a parallel algorithm to handle the scale of modern user-based datasets.
* The algorithm can process datasets with hundreds of billions of items by breaking the problem down into smaller parts computed simultaneously across multiple processors.
* Google has open-sourced the implementation on GitHub to provide the research community with a tool that maintains robust privacy guarantees even at a massive scale.

Researchers and data scientists working with large-scale sensitive datasets should consider implementing the MaxAdaptiveDegree algorithm to maximize the amount of shareable data while strictly adhering to user-level differential privacy standards.
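The weight, noise, and filter paradigm described above can be sketched in a few lines. This is the basic non-adaptive baseline, not the MAD algorithm and not Google's open-sourced implementation. The function name is hypothetical, and `sigma` and `threshold` are treated as given inputs; in a real deployment both must be calibrated from the chosen (ε, δ) privacy parameters, which this sketch does not attempt.

```python
import math
import random
from collections import defaultdict
from typing import Dict, List, Optional, Sequence


def select_partitions(
    user_items: Dict[str, Sequence[str]],
    sigma: float,
    threshold: float,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Non-adaptive weight/noise/filter baseline (illustrative only)."""
    rng = rng or random.Random()

    # Weight: a user with k distinct items adds 1/sqrt(k) to each of them,
    # so every user's total contribution has L2 norm 1 (low sensitivity).
    weights: Dict[str, float] = defaultdict(float)
    for items in user_items.values():
        unique = set(items)
        if not unique:
            continue
        share = 1.0 / math.sqrt(len(unique))
        for item in unique:
            weights[item] += share

    # Noise + filter: perturb each weight with Gaussian noise and keep
    # only the items whose noisy weight exceeds the threshold.
    released = []
    for item in sorted(weights):
        noisy_weight = weights[item] + rng.gauss(0.0, sigma)
        if noisy_weight > threshold:
            released.append(item)
    return released


users = {
    "u1": ["a", "b"],
    "u2": ["a"],
    "u3": ["a", "c"],
}
# sigma and threshold here are placeholders, not calibrated DP parameters.
released = select_partitions(users, sigma=1.0, threshold=2.0, rng=random.Random(0))
```

In this toy input, item "a" accumulates far more weight than it needs to clear the threshold while "b" and "c" fall short. That surplus is the "wastage" that adaptive weighting aims to reroute.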

line

Case Study: Improving Video

Engineers at LINE identified a recurring monthly degradation in video call quality, specifically in Japan, where packet loss increased and frames per second (FPS) dropped toward the end of each month. Investigation revealed that this pattern was caused by mobile ISP bitrate throttling once users exhausted their monthly data caps, which the existing congestion control mechanisms were failing to handle efficiently. To resolve this, the team improved their proprietary CCFS (Congestion Control based on Forward path Status) algorithm to more accurately detect these specific network constraints and maintain stable playback.

### Analysis of Monthly Quality Degradation

* Data analysis showed a "monthly cycle" where video decoding FPS was highest at the start of the month and progressively declined toward the end.
* This quality drop was specifically tied to an increase in video packet loss, which prevents normal decoding and results in stuttering or frozen frames.
* Statistical segmentation revealed the issue occurred almost exclusively on 4G mobile networks rather than Wi-Fi, and was more pronounced in high-bitrate video calls than in voice calls.
* The root cause was identified as mobile data plan policies; as users hit their monthly data limits, ISPs impose speed restrictions that create network congestion if the application continues to send high-bitrate data.

### Limitations of Standard Congestion Control

* While the IETF RMCAT working group has standardized algorithms like NADA (RFC 8698) and SCReAM (RFC 8298), real-time two-way communication requires more sensitive response times than one-way streaming.
* In two-way calls, even a one-second delay makes natural conversation difficult, meaning the system cannot rely on large buffers to smooth out network instability.
* Existing mechanisms were not reacting fast enough to the rigid throughput limits imposed by carrier throttling, leading to packet accumulation in network queues and subsequent loss.

### The CCFS Proprietary Algorithm

* LINE utilizes a custom-developed, sender-based algorithm called CCFS (Congestion Control based on Forward path Status).
* Unlike older algorithms that rely on Round Trip Time (RTT), CCFS focuses on the "forward path" (the actual path packets take to the receiver) by analyzing feedback on packet arrival times and loss.
* CCFS categorizes network status into four distinct states: Default, Probing, Throttled, and Competing.
* The system monitors "delay variation"; when it detects a continuous increase in delay exceeding a specific threshold, it transitions to the "Throttled" state to proactively reduce bitrate before the queue overflows.

### Strategies for Quality Improvement

* The team focused on refining how CCFS handles the transition into the Throttled state to better align with the artificial bandwidth ceilings created by ISPs.
* By improving the sensitivity of forward path status monitoring, the application can more rapidly adjust its transmission rate to stay within the user's current data plan limits.
* This technical adaptation ensures that even when a user's mobile speed is restricted, the video remains smooth, albeit at a lower resolution, rather than breaking up due to packet loss.

To provide a high-quality communication experience, developers must account for external factors like regional ISP policies. Refining proprietary congestion control algorithms to detect specific patterns, such as monthly data-cap throttling, allows for a more resilient service that maintains stability across diverse mobile environments.
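The transition into the Throttled state can be illustrated with a small sketch. CCFS is proprietary and its real transition rules are not given here, so everything below is an assumption: the class name, the "N consecutive increases whose total exceeds a threshold" rule, the backoff factor, and the default values. Only the four state names come from the description above, and only the move into Throttled is modeled.

```python
from enum import Enum


class State(Enum):
    DEFAULT = "default"
    PROBING = "probing"
    THROTTLED = "throttled"
    COMPETING = "competing"


class DelayTrendDetector:
    """Hypothetical detector: sustained forward-path delay growth triggers a backoff."""

    def __init__(self, threshold_ms=30.0, min_run=3, backoff=0.7, bitrate_kbps=1000.0):
        self.threshold_ms = threshold_ms  # total delay growth that counts as congestion
        self.min_run = min_run            # consecutive increases required
        self.backoff = backoff            # multiplicative bitrate reduction
        self.bitrate_kbps = bitrate_kbps
        self.state = State.DEFAULT
        self._prev_delay = None
        self._run = 0
        self._growth_ms = 0.0

    def on_feedback(self, one_way_delay_ms: float) -> State:
        """Consume one receiver feedback sample and update state and bitrate."""
        if self._prev_delay is not None and one_way_delay_ms > self._prev_delay:
            self._run += 1
            self._growth_ms += one_way_delay_ms - self._prev_delay
        else:
            # Delay stopped rising, so the queue is not building up.
            self._run = 0
            self._growth_ms = 0.0
        self._prev_delay = one_way_delay_ms

        if self._run >= self.min_run and self._growth_ms > self.threshold_ms:
            # Reduce the send rate before the bottleneck queue overflows.
            self.state = State.THROTTLED
            self.bitrate_kbps *= self.backoff
            self._run = 0
            self._growth_ms = 0.0
        return self.state


detector = DelayTrendDetector(threshold_ms=30.0, min_run=3, backoff=0.5)
for delay in [50.0, 60.0, 75.0, 95.0]:
    state = detector.on_feedback(delay)
```

Reacting to the delay trend rather than waiting for loss is the point of the design: under a hard carrier rate cap, the queue grows steadily before any packet is dropped, so the trend is the earlier signal.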

google

Beyond billion-parameter burdens: Unlocking data synthesis with a conditional generator

The CTCL (Data Synthesis with ConTrollability and CLustering) framework provides a lightweight alternative to the computationally expensive process of fine-tuning billion-parameter models for differentially private synthetic data generation. By utilizing a 140-million parameter generator and a universal topic model, the system achieves high-quality distribution matching while remaining accessible for resource-constrained applications. This approach allows for the generation of unlimited synthetic samples without incurring additional privacy costs, consistently outperforming existing API-based and large-scale baselines under strict privacy guarantees.

### Pre-training Universal Components

The framework relies on two core components developed using large-scale public corpora, which can be reused across different private domains:

* **CTCL-Topic:** A universal topic model derived from Wikipedia documents. It uses BERTopic to embed and cluster data into approximately 1,000 distinct topics, each represented by 10 descriptive keywords.
* **CTCL-Generator:** A conditional language model based on the 140M-parameter BART-base architecture. It was pre-trained on 430 million description–document pairs from the SlimPajama dataset, with descriptions generated by Gemma-2-2B to ensure the model can generate text based on specific input conditions.

### Learning the Private Domain

Once the universal components are established, the framework learns the specific characteristics of a private dataset through a two-step process:

* **Differentially Private (DP) Histograms:** The system captures high-level distributional information by creating a DP-protected histogram that represents the percentage of each topic present in the private corpus.
* **DP Fine-Tuning:** Each document in the private dataset is associated with its corresponding keywords from the CTCL-Topic model. The CTCL-Generator is then fine-tuned on these keyword-document pairs using differential privacy to ensure individual data points are protected.

### Controllable Data Generation

The final stage involves producing the synthetic dataset by sampling from the fine-tuned generator:

* **Proportional Sampling:** The system generates data by targeting the topic proportions recorded in the private domain's DP histogram.
* **Keyword Conditioning:** For each topic, the model uses the associated 10 keywords as input to prompt the DP fine-tuned generator to produce relevant documents.
* **Post-Processing Efficiency:** Because the generator is already fine-tuned with DP, the framework can generate an unlimited number of synthetic samples without further privacy budget expenditure, a significant advantage over iterative selection algorithms.

CTCL offers a scalable and efficient solution for organizations needing to synthesize private text data without the infrastructure requirements of massive LLMs. Its ability to maintain topic-wise distribution through keyword conditioning makes it well suited to specialized domains where maintaining the statistical utility of the data is as critical as protecting user privacy.
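The proportional sampling and keyword conditioning steps above can be sketched as follows. This is an illustration, not the CTCL codebase: the function names, topic labels, keyword lists (three per topic here rather than 10, for brevity), and the noisy counts are all made up. The sketch assumes the histogram has already been privatized, so clipping and normalizing it is post-processing and spends no additional privacy budget.

```python
from typing import Dict, List


def allocate_samples(noisy_counts: Dict[str, float], total: int) -> Dict[str, int]:
    """Split `total` synthetic samples across topics in proportion to a noisy histogram."""
    # DP noise can push small counts below zero; clip before normalizing.
    clipped = {topic: max(count, 0.0) for topic, count in noisy_counts.items()}
    mass = sum(clipped.values())
    if mass <= 0:
        raise ValueError("histogram has no positive mass")

    exact = {topic: total * count / mass for topic, count in clipped.items()}
    alloc = {topic: int(share) for topic, share in exact.items()}

    # Largest-remainder rounding so the allocations sum exactly to `total`.
    leftover = total - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda t: exact[t] - alloc[t], reverse=True)
    for topic in by_remainder[:leftover]:
        alloc[topic] += 1
    return alloc


def build_prompts(alloc: Dict[str, int], keywords: Dict[str, List[str]]) -> List[str]:
    """One keyword-conditioned prompt per synthetic document to generate."""
    prompts: List[str] = []
    for topic, n in alloc.items():
        prompts.extend([", ".join(keywords[topic])] * n)
    return prompts


# Hypothetical DP histogram (already noised) and topic keywords.
noisy_histogram = {"sports": 60.4, "finance": 30.1, "cooking": -2.3, "travel": 9.5}
topic_keywords = {
    "sports": ["match", "league", "goal"],
    "finance": ["market", "loan", "interest"],
    "cooking": ["recipe", "oven", "flour"],
    "travel": ["flight", "hotel", "visa"],
}
allocation = allocate_samples(noisy_histogram, total=10)
prompts = build_prompts(allocation, topic_keywords)
```

Each prompt would then be passed to the DP fine-tuned generator. Because only the privatized histogram and the privately trained model are used at this stage, `total` can be raised freely.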

line

The Current State of LY Corporation

Tech-Verse 2025 showcased LY Corporation’s strategic shift toward an AI-integrated ecosystem following the merger of LINE and Yahoo Japan. The event focused on the practical hurdles of deploying generative AI, concluding that the transition from experimental models to production-ready services requires sophisticated evaluation frameworks and deep contextual integration into developer workflows.

## AI-Driven Engineering with Ark Developer

LY Corporation’s internal "Ark Developer" solution demonstrates how AI can be embedded directly into the software development life cycle.

* The system utilizes a Retrieval-Augmented Generation (RAG) based code assistant to handle tasks such as code completion, security reviews, and automated test generation.
* Rather than treating codebases as simple text documents, the tool performs graph analysis on directory structures to maintain structural context during code synthesis.
* Real-world application includes integration with GitHub for automated Pull Request (PR) creation, with internal users reporting higher satisfaction compared to off-the-shelf tools like GitHub Copilot.

## Quantifying Quality in Generative AI

A significant portion of the technical discussion centered on moving away from subjective "vibes-based" assessments toward rigorous, multi-faceted evaluation of AI outputs.

* To measure the quality of generated images, developers utilized traditional metrics like Fréchet Inception Distance (FID) and Inception Score (IS) alongside LAION’s Aesthetic Score.
* Advanced evaluation techniques were introduced, including CLIP-IQA, Q-Align, and Visual Question Answering (VQA) based on vision-language models to analyze image accuracy.
* Technical challenges in image translation and inpainting were highlighted, specifically the difficulty of restoring layout and text structures naturally after optical character recognition (OCR) and translation.

## Global Technical Exchange and Implementation

The conference served as a collaborative hub for engineers across Japan, Taiwan, and Korea to discuss the implementation of emerging standards like the Model Context Protocol (MCP).

* Sessions emphasized the "how-to" of overcoming deployment hurdles rather than just following technical trends.
* Poster sessions (Product Street) and interactive Q&A segments allowed developers to share localized insights on LLM agent performance and agentic workflows.
* The recurring theme across diverse teams was that the "evaluation and verification" stage is now the primary driver of quality in generative AI services.

For organizations looking to scale AI, the key recommendation is to move beyond simple implementation and invest in "evaluation-driven development." By building internal tools that leverage graph-based context and quantitative metrics like Aesthetic Scores and VQA, teams can ensure that generative outputs meet professional service standards.
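For readers unfamiliar with the image metrics mentioned above, the standard definition of Fréchet Inception Distance is reproduced here as a reference. This is the general textbook formula, not anything specific to LY Corporation's evaluation setup. Here $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ denote the mean and covariance of Inception-network features computed over real and generated images respectively:

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
$$

Lower values indicate that the generated feature distribution is closer to the real one. Because FID compares whole distributions rather than individual images, it is usually paired with per-image scores such as the aesthetic and VQA-based metrics listed above.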

google

Enabling physician-centered oversight for AMIE

Guardrailed-AMIE (g-AMIE) is a diagnostic AI framework designed to perform patient history-taking while strictly adhering to safety guardrails that prevent it from providing direct medical advice. By decoupling data collection from clinical decision-making, the system enables an asynchronous oversight model where primary care physicians (PCPs) review and finalize AI-generated medical summaries. In a virtual OSCE study, g-AMIE’s diagnostic outputs and patient communications were preferred by overseeing physicians and patient actors over those of human-led control groups.

## Multi-Agent Architecture and Guardrails

* The system utilizes a multi-agent setup powered by Gemini 2.0 Flash, consisting of a dialogue agent, a guardrail agent, and a SOAP note agent.
* The dialogue agent conducts history-taking in three distinct phases: general information gathering, targeted validation of a differential diagnosis, and a conclusion phase for patient questions.
* A dedicated guardrail agent monitors and rephrases responses in real time to ensure the AI abstains from sharing individualized diagnoses or treatment plans directly with the patient.
* The SOAP note agent employs sequential multi-step generation to separate summarization tasks (Subjective and Objective) from more complex inferential tasks (Assessment and Plan).

## The Clinician Cockpit and Asynchronous Oversight

* To facilitate human review, researchers developed the "clinician cockpit," a web interface co-designed with outpatient physicians through semi-structured interviews.
* The interface is structured around the standard SOAP note format, presenting the patient’s perspective, measurable data, differential diagnosis, and proposed management strategy.
* This framework allows overseeing PCPs to review cases asynchronously, editing the AI’s proposed differential diagnoses and management plans before sharing a final message with the patient.
* The separation of history-taking from decision-making ensures that licensed medical professionals retain ultimate accountability for patient care.

## Performance Evaluation via Virtual OSCE

* The system was evaluated in a randomized, blinded virtual Objective Structured Clinical Examination (OSCE) involving 60 case scenarios.
* g-AMIE’s performance was compared against primary care physicians, nurse practitioners, and physician assistants who were required to operate under the same restrictive guardrails.
* Overseeing PCPs and independent physician raters preferred g-AMIE’s diagnostic accuracy and management plans over those of the human control groups.
* Patient actors reported a preference for the messages generated by g-AMIE compared to those drafted by human clinicians in the study.

While g-AMIE demonstrates high potential for human-AI collaboration in diagnostics, the researchers emphasize that results should be interpreted with caution. The workflow was specifically optimized for AI characteristics, and human clinicians may require specialized training to perform effectively within such highly regulated guardrail frameworks.
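The sequential multi-step generation used by the SOAP note agent can be sketched in outline. This is not g-AMIE's implementation: the function name, prompt wording, and the injected `generate` callable (standing in for a model call) are all hypothetical. The only idea taken from the description above is the ordering, with the summarization sections written first and the inferential sections conditioned on them.

```python
from typing import Callable, Dict

# Hypothetical text-generation callable: prompt in, text out.
Generate = Callable[[str], str]


def write_soap_note(transcript: str, generate: Generate) -> Dict[str, str]:
    """Draft a SOAP note in four ordered steps (illustrative prompts only)."""
    note: Dict[str, str] = {}

    # Summarization steps: depend only on the interview transcript.
    note["subjective"] = generate(
        f"Summarize the patient's reported history.\n\nTranscript:\n{transcript}"
    )
    note["objective"] = generate(
        f"List measurable findings mentioned.\n\nTranscript:\n{transcript}"
    )

    # Inferential steps: conditioned on the sections already written.
    note["assessment"] = generate(
        "Propose a differential diagnosis for physician review.\n\n"
        f"Subjective:\n{note['subjective']}\n\nObjective:\n{note['objective']}"
    )
    note["plan"] = generate(
        "Propose a management plan for physician review.\n\n"
        f"Subjective:\n{note['subjective']}\n\nObjective:\n{note['objective']}\n\n"
        f"Assessment:\n{note['assessment']}"
    )
    return note
```

In the workflow described above, the Assessment and Plan drafts would go to the overseeing physician for editing rather than to the patient.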