Supervised learning

In machine learning, supervised learning (SL) is a type of machine learning paradigm where an algorithm learns to map input data to a specific output based on example input-output pairs. This process involves training a statistical model using labeled data, meaning each piece of input data is provided with the correct output. The term "supervised" refers to the role of a teacher or supervisor who provides this training data, guiding the algorithm towards correct predictions. For instance, if you want a model to identify cats in images, supervised learning would involve feeding it many images of cats (inputs) that are explicitly labeled "cat" (outputs). The goal of supervised learning is for the trained model to accurately predict the output for new, unseen data. This requires the algorithm to effectively generalize from the training examples, a quality measured by its generalization error. Supervised learning is commonly used for tasks like classification (predicting a category, e.g., spam or not spam) and regression (predicting a continuous value, e.g., house prices). == Steps to follow == To solve a given problem of supervised learning, the following steps must be performed: Determine the type of training samples. Before doing anything else, the user should decide what kind of data is to be used as a training set. In the case of handwriting analysis, for example, this might be a single handwritten character, an entire handwritten word, an entire sentence of handwriting, or a full paragraph of handwriting. Gather a training set. The training set needs to be representative of the real-world use of the function. Thus, a set of input objects is gathered together with corresponding outputs, either from human experts or from measurements. Determine the input feature representation of the learned function. The accuracy of the learned function depends strongly on how the input object is represented. Typically, the input object is transformed into a feature vector, which contains a number of features that are descriptive of the object. The number of features should not be too large, because of the curse of dimensionality; but should contain enough information to accurately predict the output. Determine the structure of the learned function and corresponding learning algorithm. For example, one may choose to use support-vector machines or decision trees. Complete the design. Run the learning algorithm on the gathered training set. Some supervised learning algorithms require the user to determine certain control parameters. These parameters may be adjusted by optimizing performance on a subset (called a validation set) of the training set, or via cross-validation. Evaluate the accuracy of the learned function. After parameter adjustment and learning, the performance of the resulting function should be measured on a test set that is separate from the training set. == Algorithm choice == A wide range of supervised learning algorithms are available, each with its strengths and weaknesses. There is no single learning algorithm that works best on all supervised learning problems (see the No free lunch theorem). There are four major issues to consider in supervised learning: === Bias–variance tradeoff === A first issue is the tradeoff between bias and variance. Imagine that we have available several different, but equally good, training data sets. A learning algorithm is biased for a particular input x {\displaystyle x} if, when trained on each of these data sets, it is systematically incorrect when predicting the correct output for x {\displaystyle x} . A learning algorithm has high variance for a particular input x {\displaystyle x} if it predicts different output values when trained on different training sets. The prediction error of a learned classifier is related to the sum of the bias and the variance of the learning algorithm. Generally, there is a tradeoff between bias and variance. A learning algorithm with low bias must be "flexible" so that it can fit the data well. But if the learning algorithm is too flexible, it will fit each training data set differently, and hence have high variance. A key aspect of many supervised learning methods is that they are able to adjust this tradeoff between bias and variance (either automatically or by providing a bias/variance parameter that the user can adjust). === Function complexity and amount of training data === The second issue is of the amount of training data available relative to the complexity of the "true" function (classifier or regression function). If the true function is simple, then an "inflexible" learning algorithm with high bias and low variance will be able to learn it from a small amount of data. But if the true function is highly complex (e.g., because it involves complex interactions among many different input features and behaves differently in different parts of the input space), then the function will only be able to learn with a large amount of training data paired with a "flexible" learning algorithm with low bias and high variance. === Dimensionality of the input space === A third issue is the dimensionality of the input space. If the input feature vectors have large dimensions, learning the function can be difficult even if the true function only depends on a small number of those features. This is because the many "extra" dimensions can confuse the learning algorithm and cause it to have high variance. Hence, input data of large dimensions typically requires tuning the classifier to have low variance and high bias. In practice, if the engineer can manually remove irrelevant features from the input data, it will likely improve the accuracy of the learned function. In addition, there are many algorithms for feature selection that seek to identify the relevant features and discard the irrelevant ones. This is an instance of the more general strategy of dimensionality reduction, which seeks to map the input data into a lower-dimensional space prior to running the supervised learning algorithm. === Noise in the output values === A fourth issue is the degree of noise in the desired output values (the supervisory target variables). If the desired output values are often incorrect (because of human error or sensor errors), then the learning algorithm should not attempt to find a function that exactly matches the training examples. Attempting to fit the data too carefully leads to overfitting. You can overfit even when there are no measurement errors (stochastic noise) if the function you are trying to learn is too complex for your learning model. In such a situation, the part of the target function that cannot be modeled "corrupts" your training data – this phenomenon has been called deterministic noise. When either type of noise is present, it is better to go with a higher bias, lower variance estimator. In practice, there are several approaches to alleviate noise in the output values such as early stopping to prevent overfitting as well as detecting and removing the noisy training examples prior to training the supervised learning algorithm. There are several algorithms that identify noisy training examples and removing the suspected noisy training examples prior to training has decreased generalization error with statistical significance. === Other factors to consider === Other factors to consider when choosing and applying a learning algorithm include the following: Heterogeneity of the data. If the feature vectors include features of many different kinds (discrete, discrete ordered, counts, continuous values), some algorithms are easier to apply than others. Many algorithms, including support-vector machines, linear regression, logistic regression, neural networks, and nearest neighbor methods, require that the input features be numerical and scaled to similar ranges (e.g., to the [-1,1] interval). Methods that employ a distance function, such as nearest neighbor methods and support-vector machines with Gaussian kernels, are particularly sensitive to this. An advantage of decision trees is that they easily handle heterogeneous data. Redundancy in the data. If the input features contain redundant information (e.g., highly correlated features), some learning algorithms (e.g., linear regression, logistic regression, and distance-based methods) will perform poorly because of numerical instabilities. These problems can often be solved by imposing some form of regularization. Presence of interactions and non-linearities. If each of the features makes an independent contribution to the output, then algorithms based on linear functions (e.g., linear regression, logistic regression, support-vector machines, naive Bayes) and distance functions (e.g., nearest neighbor methods, support-vector machines with Gaussian kernels) generally perform well. However, if there are complex interactions among features, then algorithms such as decision trees and neural networks work better, becaus

GamePigeon

GamePigeon is a mobile app for iOS devices, developed by Vitalii Zlotskii and released on September 13, 2016. The game takes advantage of the iOS 10 update, which expanded how users could interact with Apple's Messages app. GamePigeon is only available through the Messages app, which allows players to start and respond to different party games in conversations. == Release == The app was first released on September 13, 2016, coinciding with the launch of iOS 10. The app was released for free, although it includes in-app purchases to unlock additional items, such as cosmetic skins, avatar items, new game modes, and an option to remove ads. == Games in the app == The following is a list of games that users can play within GamePigeon: Sources: Poker was one of the games included in GamePigeon at launch, although it has since been removed and is no longer listed on the game's App Store description. == Reception == GamePigeon has enjoyed commercial success, with VentureBeat noting that GamePigeon was ranked number-one in the "Top Free" category of the iMessage App Store, six months after its release. Critically, GamePigeon has been generally well received, being highlighted by online media publications early on shortly after the iOS 10 launch. It has since been included on many "best iMessage apps" lists. Based on over 162,000 ratings, the game holds a 4.0 out of 5 rating on the App Store. Julian Chokkattu of Digital Trends wrote "GamePigeon should be like the pre-installed versions of Solitaire and Minesweeper that used to come with older iterations of Windows." On its launch day, Boy Genius Report included it on a list of "10 of the best iMessage apps, games and stickers for iOS 10 on launch day." The Daily Dot wrote, "GamePigeon is easily the best current gaming option within iMessages." 8-ball and cup pong have been particularly well received by media outlets. The Daily Dot had specific praise for the app's billiards game: "8-Ball controls shockingly smoothly with your fingers, and there’s nothing quite like destroying a dear friend in poker." During his 2020 U.S. presidential campaign, Cory Booker was cited as playing the game with his family. In 2017, CNBC cited one teenager who expressed that GamePigeon was one of just a few reasons that those in her age range use the iMessage app. The game has received particular positive reception for allowing introverted individuals to exercise a form social activity; similarly, the game was highlighted as a way to maintain social distancing guidelines during the COVID-19 pandemic. As an April Fools' Day joke in 2020, The Chronicle, a Duke University newspaper, published that Duke's athletic program adopted GamePigeon's Cup Pong as an official varsity sport.

Content adaptation

Content adaptation is the action of transforming content to adapt to device capabilities. Content adaptation is usually related to mobile devices, which require special handling because of their limited computational power, small screen size, and constrained keyboard functionality. Content adaptation could roughly be divided to two fields: Media content adaptation that adapts media files. Browsing content adaptation that adapts a website to mobile devices. == Browsing content adaptation == Advances in the capabilities of small, mobile devices such as mobile phones (cell phones) and Personal Digital Assistants have led to an explosion in the number of types of device that can now access the Web. Some commentators refer to the Web that can be accessed from mobile devices as the Mobile Web. The sheer number and variety of Web-enabled devices poses significant challenges for authors of websites who want to support access from mobile devices. The W3C Device Independence Working Group described many of the issues in its report Authoring Challenges for Device Independence. Content adaptation is one approach to a solution. Rather than requiring authors to create pages explicitly for each type of device that might request them, content adaptation transforms an author's materials automatically. For example, content might be converted from a device-independent markup language, such as XDIME, an implementation of the W3C's DIAL specification, into a form suitable for the device, such as XHTML Basic, C-HTML, or WML. Similarly, a suitable device-specific CSS style sheet or a set of in-line styles might be generated from abstract style definitions. Likewise, a device specific layout might be generated from abstract layout definitions. Once created, the device-specific materials form the response returned to the device from which the request was made. Another way is to use the latest trend responsive design based on CSS, covered in this article (RWD). Content adaptation requires a processor that performs the selection, modification, and generation of materials to form the device-specific result. IBM's Websphere Everyplace Mobile Portal (WEMP), BEA Systems' WebLogic Mobility Server, Morfeo's MyMobileWeb, and Apache Cocoon are examples of such processors. Wurfl and WALL are popular open source tools for content adaptation. WURFL is an XML-based Device Description Repository with APIs to access the data in Java and PHP (and other popular programming languages). WALL (Wireless Abstraction Library) lets a developer author mobile pages which look like plain HTML, but converts them to WML, C-HTML, or XHTML Mobile Profile, depending on the capabilities of the device from which the HTTP request originates. GreasySpoon lets the developer build plugins for content editing, in JavaScript, Ruby (programming language), and more, just like the Firefox application GreaseMonkey. Alembik (Media Transcoding Server) is a Java (J2EE) application providing transcoding services for variety of clients and for different media types (image, audio, video, etc.). It is fully compliant with OMA's Standard Transcoder Interface specification and is distributed under the LGPL open source license. In 2007, the first large scale carrier-grade deployments of content transformation, on existing mass-market handsets, with no software download required, were deployed by Vodafone in the UK and globally for Yahoo! oneSearch, using the Novarra Vision solution. Novarra's content adaptation solution had been used in enterprise intranet deployments as early as 2003 (at that time, the platform was named "Engines for Wireless Data"). InfoGin, the 9-year-old content-adaptation company with customers like Vodafone, Orange, Telefónica and PCCW. The patented "Web to Mobile adaptation", Mobile Matrix Transcoder, Multimedia and Documents transcoders, Video adaptation supporte. Launched in 2007, Bytemobile's Web Fidelity Service was another carrier-grade, commercial infrastructure solution, which provided wireless content adaptation to mobile subscribers on their existing mass-market handsets, with no client download required.

VHS

VHS (Video Home System) is a discontinued standard for consumer-level analog video recording on tape cassettes, introduced in 1976 by JVC. It was the dominant home video format throughout the tape media period of the 1980s and 1990s. Magnetic tape video recording was adopted by the television industry in the 1950s in the form of the first commercialized video tape recorders (VTRs), but the devices were expensive and used only in professional environments. In the 1970s, videotape technology became affordable for home use, and widespread adoption of videocassette recorders (VCRs) began; the VHS became the most popular media format for VCRs as it would win the "format war" against Betamax (backed by Sony) and a number of other competing tape standards. The cassettes themselves use a 0.5-inch (12.7 mm) magnetic tape between two spools and typically offer a capacity of at least two hours. The popularity of VHS was intertwined with the rise of the video rental market, when films were released on pre-recorded videotapes for home viewing. Newer improved tape formats such as S-VHS were later developed, as well as the earliest optical disc format, LaserDisc; the lack of global adoption of these formats increased VHS's lifetime, which eventually peaked and started to decline in the late 1990s after the introduction of DVD, a digital optical disc format. VHS rentals were surpassed by DVD in the United States in 2003, which eventually became the preferred low-end method of movie distribution. For home recording purposes, VHS and VCRs were surpassed by (typically hard disk–based) digital video recorders (DVR) in the 2000s. Production of all VHS equipment ceased by 2016, although the format has since gained some popularity amongst collectors. A niche revival of VHS has taken place with This Is How The World Ends becoming the first straight-to-VHS release in 20 years. == History == === Before VHS === In 1956, after several attempts by other companies, the first commercially successful VTR, the Ampex VRX-1000, was introduced by Ampex Corporation. At a price of US$50,000 in 1956 (equivalent to $592,000 in 2025) and US$300 (equivalent to $3,600 in 2025) for a 90-minute reel of tape, it was intended only for the professional market. Kenjiro Takayanagi, a television broadcasting pioneer then working for JVC as its vice president, saw the need for his company to produce VTRs for the Japanese market at a more affordable price. In 1959, JVC developed a two-head video tape recorder and, by 1960, a color version for professional broadcasting. In 1964, JVC released the DV220, which would be the company's standard VTR until the mid-1970s. In 1969, JVC collaborated with Sony and Matsushita Electric (Matsushita was the majority stockholder of JVC until 2011) to build a video recording standard for the Japanese consumer. The effort produced the U-matic format in 1971, which was the first cassette format to become a unified standard for different companies. It was preceded by the reel-to-reel 1⁄2-inch EIAJ format. The U-matic format was successful in businesses and some broadcast television applications, such as electronic news-gathering, and was produced by all three companies until the late 1980s, but because of cost and limited recording time, very few of the machines were sold for home use. Therefore, soon after the U-Matic release, all three companies started working on new consumer-grade video recording formats of their own. Sony started working on Betamax, Matsushita started working on VX, and JVC released the CR-6060 in 1975, based on the U-matic format. === VHS development === In 1971, JVC engineers Yuma Shiraishi and Shizuo Takano put together a team to develop a VTR for consumers. By the end of 1971, they created an internal diagram, "VHS Development Matrix", which established twelve objectives for JVC's new VTR; among them: The system must be compatible with any ordinary television set. Picture quality must be similar to a normal air broadcast. The tape must have at least a two-hour recording capacity. Tapes must be interchangeable between machines. The overall system should be versatile, meaning it can be scaled and expanded, such as connecting a video camera, or dubbing between two recorders. Recorders should be affordable, easy to operate, and have low maintenance costs. Recorders must be capable of being produced in high volume, their parts must be interchangeable, and they must be easy to service. In early 1972, the commercial video recording industry in Japan took a financial hit. JVC cut its budgets and restructured its video division, shelving the VHS project. However, despite the lack of funding, Takano and Shiraishi continued to work on the project in secret. By 1973, the two engineers had produced a functional prototype. === Competition with Betamax === In 1974, the Japanese Ministry of International Trade and Industry (MITI), desiring to avoid consumer confusion, attempted to force the Japanese video industry to standardize on just one home video recording format. Later, Sony had a functional prototype of the Betamax format, and was very close to releasing a finished product. With this prototype, Sony persuaded the MITI to adopt Betamax as the standard, and allow it to license the technology to other companies. JVC believed that an open standard, with the format shared among competitors without licensing the technology, was better for the consumer. To prevent the MITI from adopting Betamax, JVC worked to convince other companies, in particular Matsushita (Japan's largest electronics manufacturer at the time, marketing its products under the National brand in most territories and the Panasonic brand in North America, and JVC's majority stockholder), to accept VHS, and thereby work against Sony and the MITI. Matsushita agreed, fearing Sony would dominate the market with a Betamax monopoly. Matsushita also regarded Betamax's one-hour recording time limit as a disadvantage. Matsushita's backing of JVC persuaded Hitachi, Mitsubishi, and Sharp to back the VHS standard as well. Sony's release of its Betamax unit to the Japanese market in 1975 placed further pressure on the MITI to side with the company. However, the collaboration of JVC and its partners was much stronger, which eventually led the MITI to drop its push for an industry standard. JVC released the first VHS machines in Japan in late 1976, and in the United States in mid-1977. Sony's Betamax competed with VHS throughout the late 1970s and into the 1980s (see Videotape format war). Betamax's major advantages were its smaller cassette size, theoretical higher video quality, and earlier availability, but its shorter recording time proved to be a major shortcoming. Originally, Beta I machines using the NTSC television standard were able to record one hour of programming at their standard tape speed of 1.5 inches per second (ips). The first VHS machines could record for two hours, due to both a slightly slower tape speed (1.31 ips) and significantly longer tape. Betamax's smaller cassette limited the size of the reel of tape, and could not compete with VHS's two-hour capability by extending the tape length. Instead, Sony had to slow the tape down to 0.787 ips (Beta II) in order to achieve two hours of recording in the same cassette size. Sony eventually created a Beta III speed of 0.524 ips, which allowed NTSC Betamax to break the two-hour limit, but by then VHS had already won the format battle. Additionally, VHS had a "far less complex tape transport mechanism" than Betamax, and VHS machines were faster at rewinding and fast-forwarding than their Sony counterparts. VHS eventually won the war, gaining 60% of the North American market by 1980. == Initial releases of VHS-based devices == The first VCR to use VHS was the Victor HR-3300, and was introduced by the president of JVC in Japan on September 9, 1976. JVC started selling the HR-3300 in Akihabara, Tokyo, Japan, on October 31, 1976. Region-specific versions of the JVC HR-3300 were also distributed later on, such as the HR-3300U in the United States, and the HR-3300EK in the United Kingdom. The United States received its first VHS-based VCR, the RCA VBT200, on August 23, 1977. The RCA unit was designed by Matsushita and was the first VHS-based VCR manufactured by a company other than JVC. It was also capable of recording four hours in LP (long play) mode. The UK received its first VHS-based VCR, the Victor HR-3300EK, in 1978. Quasar and General Electric followed-up with VHS-based VCRs – all designed by Matsushita. By 1999, Matsushita alone produced just over half of all Japanese VCRs. TV/VCR combos, combining a TV set with a VHS mechanism, were also once available for purchase. Combo units containing both a VHS mechanism and a DVD player were introduced in the late 1990s, and at least one combo unit, the Panasonic DMP-BD70V, included a Blu-ray player. == Technical details == VHS has been standardized in IEC 60774–1. === Cassette and

Web science

Web science is an emerging interdisciplinary field concerned with the study of large-scale socio-technical systems, particularly the World Wide Web. It considers the relationship between people and technology, the ways that society and technology co-constitute one another and the impact of this co-constitution on broader society. Web Science combines research from disciplines as diverse as sociology, computer science, economics, and mathematics. The Web Science Institute, founded at the University of Southampton by director Wendy Hall and colleagues, describes Web Science as focusing "the analytical power of researchers from disciplines as diverse as mathematics, sociology, economics, psychology, law and computer science to understand and explain the Web. It is necessarily interdisciplinary – as much about social and organizational behaviour as about the underpinning technology." A central pillar of Web science development is Artificial Intelligence or "AI". The current artificial intelligence that in development at the moment is Human-Centered, with goals to further professional development courses as well as influencing public policy. Artificial intelligence developers are focused on the most impactful uses of this technology, while also hoping to expedite the growth and development of the human race. An early definition was given by American computer scientist Ben Shneiderman: "Web Science" is processing the information available on the web in similar terms to those applied to natural environment. == Areas of activity == === Emergent properties === Philip Tetlow, an IBM-based scientist influential in the emergence of web science as an independent discipline, argued for the concept of web life, which considers the Web not as a connected network of computers, as in common interpretations of the Internet, but rather as a sociotechnical machine capable of fusing together individuals and organisations into larger coordinated groups. It argues that unlike the technologies that have come before it, the Web is different in that its phenomenal growth and complexity are starting to outstrip our capability to control it directly, making it impossible for us to grasp its completeness in one go. Tetlow made use of Fritjof Capra's concept of the 'web of life' as a metaphor. == Research groups == There are numerous academic research groups engaged in Web Science research, many of which are members of WSTNet, the Web Science Trust Network of research labs. Health Web Science emerged as a sub-discipline of Web Science that studies the role of the Web's impact on human's health outcomes and how to further utilize the Web to improve health outcomes. These groups focus on the developmental possibilities, provided through Web Science, in areas such as health care and social welfare. Discussion of web science has been widely adopted as a method in which the internet can have a real world impact in the field of medicine, currently coined Medicine 2.0. The World Wide Web acts as a medium for the spread and circulation of knowledge, though these various research groups consider themselves responsible for maintaining verifiable and testable knowledge. Using their knowledge of the healthcare system as well as web science, researchers are focused on formatting and structuring their knowledge in a way that is easily accessible throughout the internet. The World Wide Web is quickly evolving meaning that the information we provide and its formatting must also. Recognizing the overlap between both aspects, the spread of knowledge and development of the internet, allows us to properly display our knowledge in a manner that evolves as quickly as the internet and everyday medical research. The accessibility of the internet and quick development of knowledge must be companied with efficient formatting to allocate successful dissemination of information, as described by these various researcher groups. == Related major conferences == Association for Computing Machinery (ACM), Hypertext Conference (HT) sponsored by SIGWEB ACM SIGCHI Conference on Human Factors in Computing Systems (CHI) International AAAI Conference on Weblogs and Social Media (ICWSM) The Web Conference (WWW) Association for Computing Machinery (ACM) Web Science Conference (WebSci)

Neural radiance field

A neural radiance field (NeRF) is a neural field for reconstructing a three-dimensional representation of a scene from two-dimensional images. The NeRF model enables downstream applications of novel view synthesis, scene geometry reconstruction, and obtaining the reflectance properties of the scene. Additional scene properties such as camera poses may also be jointly learned. First introduced in 2020, it has since gained significant attention for its potential applications in computer graphics and content creation. == Algorithm == The NeRF algorithm represents a scene as a radiance field parametrized by a deep neural network (DNN). The network predicts a volume density and view-dependent emitted radiance given the spatial location ( x , y , z ) {\displaystyle (x,y,z)} and viewing direction in Euler angles ( θ , Φ ) {\displaystyle (\theta ,\Phi )} of the camera. By sampling many points along camera rays, traditional volume rendering techniques can produce an image. === Data collection === A NeRF needs to be retrained for each unique scene. The first step is to collect images of the scene from different angles and their respective camera pose. These images are standard 2D images and do not require a specialized camera or software. Any camera is able to generate datasets, provided the settings and capture method meet the requirements for SfM (Structure from Motion). This requires tracking of the camera position and orientation, often through some combination of SLAM, GPS, or inertial estimation. Researchers often use synthetic data to evaluate NeRF and related techniques. For such data, images (rendered through traditional non-learned methods) and respective camera poses are reproducible and error-free. === Training === For each sparse viewpoint (image and camera pose) provided, camera rays are marched through the scene, generating a set of 3D points with a given radiance direction (into the camera). For these points, volume density and emitted radiance are predicted using the multi-layer perceptron (MLP). An image is then generated through classical volume rendering. Because this process is fully differentiable, the error between the predicted image and the original image can be minimized with gradient descent over multiple viewpoints, encouraging the MLP to develop a coherent model of the scene. == Variations and improvements == Early versions of NeRF were slow to optimize and required that all input views were taken with the same camera in the same lighting conditions. These performed best when limited to orbiting around individual objects, such as a drum set, plants or small toys. Since the original paper in 2020, many improvements have been made to the NeRF algorithm, with variations for special use cases. === Fourier feature mapping === In 2020, shortly after the release of NeRF, the addition of Fourier Feature Mapping improved training speed and image accuracy. Deep neural networks struggle to learn high frequency functions in low dimensional domains; a phenomenon known as spectral bias. To overcome this shortcoming, points are mapped to a higher dimensional feature space before being fed into the MLP. γ ( v ) = [ a 1 cos ⁡ ( 2 π B 1 T v ) a 1 sin ⁡ ( 2 π B 1 T v ) ⋮ a m cos ⁡ ( 2 π B m T v ) a m sin ⁡ ( 2 π B m T v ) ] {\displaystyle \gamma (\mathrm {v} )={\begin{bmatrix}a_{1}\cos(2{\pi }{\mathrm {B} }_{1}^{T}\mathrm {v} )\\a_{1}\sin(2\pi {\mathrm {B} }_{1}^{T}\mathrm {v} )\\\vdots \\a_{m}\cos(2{\pi }{\mathrm {B} }_{m}^{T}\mathrm {v} )\\a_{m}\sin(2{\pi }{\mathrm {B} }_{m}^{T}\mathrm {v} )\end{bmatrix}}} Where v {\displaystyle \mathrm {v} } is the input point, B i {\displaystyle \mathrm {B} _{i}} are the frequency vectors, and a i {\displaystyle a_{i}} are coefficients. This allows for rapid convergence to high frequency functions, such as pixels in a detailed image. === Bundle-adjusting neural radiance fields === One limitation of NeRFs is the requirement of knowing accurate camera poses to train the model. Often times, pose estimation methods are not completely accurate, nor is the camera pose even possible to know. These imperfections result in artifacts and suboptimal convergence. So, a method was developed to optimize the camera pose along with the volumetric function itself. Called Bundle-Adjusting Neural Radiance Field (BARF), the technique uses a dynamic low-pass filter (DLPF) to go from coarse to fine adjustment, minimizing error by finding the geometric transformation to the desired image. This corrects imperfect camera poses and greatly improves the quality of NeRF renders. === Multiscale representation === Conventional NeRFs struggle to represent detail at all viewing distances, producing blurry images up close and overly aliased images from distant views. In 2021, researchers introduced a technique to improve the sharpness of details at different viewing scales known as mip-NeRF (comes from mipmap). Rather than sampling a single ray per pixel, the technique fits a gaussian to the conical frustum cast by the camera. This improvement effectively anti-aliases across all viewing scales. mip-NeRF also reduces overall image error and is faster to converge at about half the size of ray-based NeRF. === Learned initializations === In 2021, researchers applied meta-learning to assign initial weights to the MLP. This rapidly speeds up convergence by effectively giving the network a head start in gradient descent. Meta-learning also allowed the MLP to learn an underlying representation of certain scene types. For example, given a dataset of famous tourist landmarks, an initialized NeRF could partially reconstruct a scene given one image. === NeRF in the wild === Conventional NeRFs are vulnerable to slight variations in input images (objects, lighting) often resulting in ghosting and artifacts. As a result, NeRFs struggle to represent dynamic scenes, such as bustling city streets with changes in lighting and dynamic objects. In 2021, researchers at Google developed a new method for accounting for these variations, named NeRF in the Wild (NeRF-W). This method splits the neural network (MLP) into three separate models. The main MLP is retained to encode the static volumetric radiance. However, it operates in sequence with a separate MLP for appearance embedding (changes in lighting, camera properties) and an MLP for transient embedding (changes in scene objects). This allows the NeRF to be trained on diverse photo collections, such as those taken by mobile phones at different times of day. === Relighting === In 2021, researchers added more outputs to the MLP at the heart of NeRFs. The output now included: volume density, surface normal, material parameters, distance to the first surface intersection (in any direction), and visibility of the external environment in any direction. The inclusion of these new parameters lets the MLP learn material properties, rather than pure radiance values. This facilitates a more complex rendering pipeline, calculating direct and global illumination, specular highlights, and shadows. As a result, the NeRF can render the scene under any lighting conditions with no re-training. === Plenoctrees === Although NeRFs had reached high levels of fidelity, their costly compute time made them useless for many applications requiring real-time rendering, such as VR/AR and interactive content. Introduced in 2021, Plenoctrees (plenoptic octrees) enabled real-time rendering of pre-trained NeRFs through division of the volumetric radiance function into an octree. Rather than assigning a radiance direction into the camera, viewing direction is taken out of the network input and spherical radiance is predicted for each region. This makes rendering over 3000x faster than conventional NeRFs. === Sparse Neural Radiance Grid === Similar to Plenoctrees, this method enabled real-time rendering of pretrained NeRFs. To avoid querying the large MLP for each point, this method bakes NeRFs into Sparse Neural Radiance Grids (SNeRG). A SNeRG is a sparse voxel grid containing opacity and color, with learned feature vectors to encode view-dependent information. A lightweight, more efficient MLP is then used to produce view-dependent residuals to modify the color and opacity. To enable this compressive baking, small changes to the NeRF architecture were made, such as running the MLP once per pixel rather than for each point along the ray. These improvements make SNeRG extremely efficient, outperforming Plenoctrees. === Instant NeRFs === In 2022, researchers at Nvidia enabled real-time training of NeRFs through a technique known as Instant Neural Graphics Primitives. An innovative input encoding reduces computation, enabling real-time training of a NeRF, an improvement orders of magnitude above previous methods. The speedup stems from the use of spatial hash functions, which have O ( 1 ) {\displaystyle O(1)} access times, and parallelized architectures which run fast on modern GPUs. == Related techniques == === Plenoxels === Plen

Nitro Zeus

Nitro Zeus is the project name for a well funded comprehensive cyber attack plan created as a mitigation strategy after the Stuxnet malware campaign and its aftermath. Unlike Stuxnet, that was loaded onto a system after the design phase to affect its proper operation, Nitro Zeus's objectives are built into a system during the design phase unbeknownst to the system users. This built-in feature allows a more assured and effective cyber attack against the system's users. The information about its existence was raised during research and interviews carried out by Alex Gibney for his Zero Days documentary film. The proposed long term widespread infiltration of major Iranian systems would disrupt and degrade communications, power grid, and other vital systems as desired by the cyber attackers. This was to be achieved by electronic implants in Iranian computer networks. The project was seen as one pathway in alternatives to full-scale war.

GamePigeon

Content adaptation

VHS

Web science

Neural radiance field

Nitro Zeus

Related Articles